Abstract

We consider a broadcast channel with a degraded message set, in which a single transmitter
sends a common message to two receivers and a private message to one of the receivers only.
The main goal of this work is to find new lower bounds to the error exponents of the strong
user, the one that should decode both messages, and of the weak user, that should decode
only the common message. Unlike previous works, where suboptimal decoders were used, the
exponents we derive in this work pertain to optimal decoding and depend on both rates. We
take two different approaches.

The first approach is based, in part, on variations of Gallager-type bounding techniques 
that were presented in a much earlier work on error exponents for erasure/list decoding. The 
resulting lower bounds are quite simple to understand and to compute. 

The second approach is based on a technique that is rooted in statistical physics, and it
is exponentially tight from the initial step onward. This technique is based on analyzing
the statistics of certain enumerators. Numerical results show that the bounds obtained by
this technique are tighter than those obtained by the first approach and previous results. The
derivation, however, is more complex than that of the first approach and the resulting exponents are
harder to compute.

Index Terms: broadcast channel, random coding, error exponents. 

1 Introduction 



In the broadcast channel (BC), as introduced by Cover [1], a single source communicates with two
or more receivers. In this work, we concentrate on the case of two receivers. The encoder sends a
common message, to be decoded by both receivers, and a private message for each decoder. In the
case of a degraded message set, one of the private messages is absent. The capacity region of the BC
with a degraded message set was found in [2]. A coding theorem for degraded broadcast channels
was given by Bergmans [3] and the converse for the degraded channel case was given by Gallager
[4]. Bergmans suggested the use of a hierarchical random code: First draw "cloud centers". Next,
around each "cloud center", draw a cloud of codewords. The sender sends a specific codeword from
one of the clouds. The strong decoder (the one with the better channel) can identify the specific
codeword while the weak decoder can only identify the cloud it originated from (see Section 2 and
[3]).

The error exponent is the rate of exponential decay of the average probability of error as a 
function of the block length. Unlike in the single user regime, where the error exponent is a 
function of the rate at which the transmitter operates, in the multiuser regime, the error exponent 
for each user is a function of all rates in the system. We can define an error exponent region, that 
is, a set of achievable error exponents for fixed rates of both users (see [5]). The tradeoff between 
the exponents is controlled by the choice of the random coding distributions. 

Earlier work on error exponents for general degraded broadcast channels includes [4] and [6]. 
Both [4] and [6] used the coding scheme of [3], but did not use optimal decoding. In [4], a direct 
channel from the cloud center to the weak user is defined and the error exponent is calculated for this 
channel. By defining this channel, the decoder does not use its knowledge of the refined codebook 
of each cloud. The resulting exponent depends only on one of the rates: the one corresponding
to the number of clouds. When the clouds are "full" (high rate of the private message), not much
is lost by the use of the defined direct channel. However, for low rates of the private message, the
decoding quality can be improved by knowing the codebook. In [6], universally attainable error
exponents are given for a suboptimal decoder. Lower and upper bounds to the error exponents,
that depend on both rates, are given.

In this work, we derive new lower bounds to the error exponents for both the weak and the 
strong decoder of a degraded BC with degraded message sets. The derived exponents pertain to 
optimum decoding and they depend simultaneously on both rates. We present two approaches 
to derive the exponents, which start from the same initial step, but are substantially different 
otherwise. 

The first approach is based, in part, on variations of Gallager-type bounding techniques along 
with refinements that were used in Forney's work on error exponents for erasure/list decoding [7]. 
Using these techniques, we derive new lower bounds which are quite simple to understand and 
compute. Both this approach and the approach of [4] use Jensen's inequality, as well as other
inequalities, which may compromise the tightness of the obtained bounds in the exponential scale.

Our second approach avoids the use of these inequalities. Instead, an exponentially tight
evaluation of the relevant expressions is derived by assessing the moments of certain type class
enumerators. The underlying ideas behind the second approach are inspired by the statistical
mechanical point of view on random code ensembles [8], [9]. The analysis tools we use in this
approach are applicable to other problem settings as well, e.g., [10] and [11], where they lead to
tighter bounds than those of other methods previously used. The second approach, after its initial
step, is guaranteed to be exponentially tight, and is shown to obtain tighter bounds than the first
approach and previous results. However, this tightness comes at the price of the complexity of
both the derivation and the final results, which makes the task of obtaining numerical results quite
involved.

The outline of the remaining part of this work is as follows: Section 2 gives the formal setting
and notation. In Section 3, we summarize the main results of this paper, giving the resulting
exponents of each of the approaches. In Sections 4 and 5, we derive the exponents using the first
and second approach, respectively. At the end of each of the sections, we give numerical results for
the degraded binary symmetric channel (BSC). We conclude our work in Section 6.

2 Preliminaries 

We begin with notation conventions. Capital letters represent scalar random variables (RVs) and 
specific realizations of them are denoted by the corresponding lower case letters. Random vectors 
of dimension $n$ will be denoted by bold-face letters. Indicator functions of events will be denoted
by $\mathcal{I}\{\cdot\}$. We write $[x]^+$ for the positive part of a real number $x$, i.e., $[x]^+ = \max(x,0)$. The
expectation operator will be denoted by $E\{\cdot\}$. When we wish to emphasize the dependence of
the expectation on a certain underlying probability distribution, say, $Q$, we subscript it by $Q$, i.e.,
$E_Q\{\cdot\}$. We consider a memoryless broadcast channel with a finite input alphabet $\mathcal{X}$ and finite
output alphabets $\mathcal{Y}$ and $\mathcal{Z}$ of the strong decoder and the weak decoder, respectively, given by
$P(\mathbf{y},\mathbf{z}|\mathbf{x}) = \prod_{t=1}^{n} P(y_t,z_t|x_t)$, $(\mathbf{x},\mathbf{y},\mathbf{z}) \in \mathcal{X}^n \times \mathcal{Y}^n \times \mathcal{Z}^n$. We are interested in sending one of
$M_{yz} = e^{nR_{yz}}$ messages to both receivers and one of $M_y = e^{nR_y}$ to the strong receiver, that observes
$\mathbf{y}$.

Consider a random selection of a hierarchical code [3] as follows: First, $M_{yz} = e^{nR_{yz}}$ "cloud centers"
$\mathbf{u}_1, \ldots, \mathbf{u}_{M_{yz}} \in \mathcal{U}^n$ are drawn independently, each one using a distribution $P(\mathbf{u}) = \prod_{t=1}^{n} P(u_t)$,
where $u \in \mathcal{U}$ is an auxiliary random variable. Then, for each $m = 1,2,\ldots,M_{yz}$, $M_y = e^{nR_y}$ codewords
$\mathbf{x}_{m,1}, \ldots, \mathbf{x}_{m,M_y} \in \mathcal{X}^n$ are drawn according to $P(\mathbf{x}|\mathbf{u}) = \prod_{t=1}^{n} P(x_t|u_t)$, with $\mathbf{u} = \mathbf{u}_m$.
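The hierarchical construction above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `draw_hierarchical_code`, the dictionary representation of the distributions, and the toy binary parameters are hypothetical and not taken from the paper.

```python
import random

def draw_hierarchical_code(n, num_clouds, cloud_size, p_u, p_x_given_u, seed=0):
    """Return (centers, clouds): centers[m] plays the role of u_m and
    clouds[m][i] the role of x_{m,i}. Distributions are plain dicts."""
    rng = random.Random(seed)
    u_alphabet = list(p_u)
    u_weights = [p_u[a] for a in u_alphabet]
    centers, clouds = [], []
    for _ in range(num_clouds):
        # Cloud center: n i.i.d. draws from P(u).
        u = rng.choices(u_alphabet, weights=u_weights, k=n)
        cloud = []
        for _ in range(cloud_size):
            # Codeword: x_t is drawn from P(.|u_t), independently over t.
            x = [rng.choices(list(p_x_given_u[a]),
                             weights=list(p_x_given_u[a].values()))[0]
                 for a in u]
            cloud.append(x)
        centers.append(u)
        clouds.append(cloud)
    return centers, clouds

# Toy binary example (assumed values): U uniform, X = U through a BSC(beta).
beta = 0.1
p_u = {0: 0.5, 1: 0.5}
p_x_given_u = {0: {0: 1 - beta, 1: beta}, 1: {0: beta, 1: 1 - beta}}
centers, clouds = draw_hierarchical_code(8, 4, 3, p_u, p_x_given_u)
```

In the paper the numbers of clouds and of codewords per cloud grow as $e^{nR_{yz}}$ and $e^{nR_y}$; the sketch takes them as explicit integers so that it stays runnable for small $n$.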

The strong decoder is interested in decoding both indices $(m, i)$ of the transmitted codeword
$\mathbf{x}_{m,i}$, whereas the weak decoder, the one that observes $\mathbf{z}$, is only interested in decoding the
index $m$. Thus, while the strong decoder best applies full maximum likelihood (ML) decoding,
$(\hat{m}(\mathbf{y}), \hat{i}(\mathbf{y})) = \arg\max_{m,i} P_1(\mathbf{y}|\mathbf{x}_{m,i})$, the best decoding rule for the weak decoder is given by
$\hat{m}(\mathbf{z}) = \arg\max_m \frac{1}{M_y}\sum_{i=1}^{M_y} P_3(\mathbf{z}|\mathbf{x}_{m,i})$, where $P_3(\mathbf{z}|\mathbf{x}) = \prod_{t=1}^{n} P_3(z_t|x_t) = \prod_{t=1}^{n}\sum_{y_t} P(y_t, z_t|x_t)$.
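The two decoding rules can be illustrated on a toy code. The sketch below assumes binary symmetric channels to both receivers; the helper names `bsc_likelihood`, `strong_decode` and `weak_decode` are hypothetical.

```python
def bsc_likelihood(out, inp, p):
    """P(out | inp) over a memoryless BSC with crossover probability p."""
    d = sum(a != b for a, b in zip(out, inp))
    return (p ** d) * ((1 - p) ** (len(out) - d))

def strong_decode(y, clouds, p_y):
    """Full ML decoding: the pair (m, i) maximizing P1(y | x_{m,i})."""
    pairs = [(m, i) for m, cloud in enumerate(clouds) for i in range(len(cloud))]
    return max(pairs, key=lambda mi: bsc_likelihood(y, clouds[mi[0]][mi[1]], p_y))

def weak_decode(z, clouds, p_z):
    """Optimal rule for the cloud index: maximize the likelihood averaged
    over all codewords of the cloud, (1/M_y) * sum_i P3(z | x_{m,i})."""
    def cloud_likelihood(m):
        return sum(bsc_likelihood(z, x, p_z) for x in clouds[m]) / len(clouds[m])
    return max(range(len(clouds)), key=cloud_likelihood)

# Toy code with two clouds of two codewords each (assumed values).
clouds = [[[0, 0, 0, 0], [0, 0, 0, 1]],
          [[1, 1, 1, 1], [1, 1, 1, 0]]]
m_i = strong_decode([0, 0, 0, 1], clouds, 0.05)
m_hat = weak_decode([1, 1, 0, 1], clouds, 0.3)
```

Note that the weak decoder uses every codeword of each cloud, which is exactly the knowledge of the refined codebook that the suboptimal decoder of [4] ignores.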

The capacity region for a BC with degraded message sets is given [2] by the closure of:

$$\left\{(R_{yz}, R_y):\ R_{yz} \le I(U;Z),\ R_y \le I(X;Y|U),\ R_{yz} + R_y \le I(X;Y)\right\}$$

for some $P(u,x,y,z) = P(u)P(x|u)P(y,z|x)$ and $|\mathcal{U}| \le |\mathcal{X}| + 2$. If the channel is degraded, since
we have $U \to X \to Y \to Z$, the restriction on the sum of rates is trivially satisfied and can be
omitted. The capacity region for the general BC is still an open problem. The best inner bound
for it is given by Marton [12] and, in a simpler manner, by El Gamal and van der Meulen [13]:

$$\left\{(R_{yz}, R_y):\ R_{yz} \le I(U;Z),\ R_y \le I(V;Y),\ R_{yz} + R_y \le I(U;Z) + I(V;Y) - I(U;V)\right\}$$

for some $p(x, u, v)$, where $u, v$ are auxiliary random variables with finite ranges.

Denote the average error probability of the strong decoder by
$P_E^y = \Pr\{(\hat{m}(\mathbf{y}), \hat{i}(\mathbf{y})) \ne (m, i)\}$ and the average error probability of the weak decoder by
$P_E^z = \Pr\{\hat{m}(\mathbf{z}) \ne m\}$. The exponents of the strong and weak decoders will be denoted by $E_y$ and $E_z$,
respectively. A pair $(E_y, E_z)$ is said to be an attainable pair in the random coding sense, for a given
$(R_y, R_{yz})$, if there exist random coding distributions $\{P(u)\}$ and $\{P(x|u)\}$ such that the random
coding exponents satisfy $E_y \le \liminf_{n\to\infty} -\frac{1}{n}\log P_E^y$ and $E_z \le \liminf_{n\to\infty} -\frac{1}{n}\log P_E^z$, where all
logarithms throughout the sequel are taken to the natural base. For a given pair $(R_y, R_{yz})$, we say
that $E_z$ is an attainable exponent for the weak user if there exists $E_y > 0$ such that the pair
$(E_y, E_z)$ is attainable in the random coding sense.

3 Main Results 



In this section, we outline the main results of this paper. As described in the Introduction, we use
two different approaches to derive the error exponents of a general degraded broadcast channel, 
pertaining to optimal decoding. We introduce the resulting exponents of each of these approaches 
in the following two subsections. 

3.1 Gallager-type bound 

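Denoting $f(a,b,z) = \sum_u P(u)\left[\sum_x P(x|u) P_3(z|x)^{a/b}\right]^b$, we define:

$$E_0(\rho,\lambda,\alpha,\mu) = -\log\left[\sum_z f(1-\rho\lambda,\alpha,z)\cdot f^{\rho}(\lambda,\mu,z)\right]$$

$$E_{y1}(R_y,\rho) = -\rho R_y - \log\sum_y\sum_u P(u)\left[\sum_x P(x|u)P_1(y|x)^{\frac{1}{1+\rho}}\right]^{1+\rho}$$

$$E_{y2}(R_y,R_{yz},\rho) = -\rho(R_y + R_{yz}) - \log\sum_y\left[\sum_x P(x)P_1(y|x)^{\frac{1}{1+\rho}}\right]^{1+\rho} \tag{1}$$

Let

$$E_{z,1}(R_{yz},R_y) = \max_{0\le\rho\le1,\ 0\le\lambda\le\mu\le1,\ 1-\rho\lambda\le\alpha\le1}\left\{E_0(\rho,\lambda,\alpha,\mu) - (\alpha+\rho\mu-1)R_y - \rho R_{yz}\right\}$$

$$E_{y,1}(R_{yz},R_y) = \min\left\{\max_{0\le\rho\le1} E_{y1}(R_y,\rho),\ \max_{0\le\rho\le1} E_{y2}(R_y,R_{yz},\rho)\right\} \tag{2}$$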

The first main result of this paper is the following theorem.
Theorem 1: For the degraded broadcast channel defined in Section 2, the pair
$(E_{z,1}(R_{yz},R_y), E_{y,1}(R_{yz},R_y))$, as defined in eq. (2), is an attainable pair in the random coding
sense.
We prove this theorem in Section 4. Unlike in earlier papers [4], [6], [5], the exponents of
Theorem 1 pertain to optimal decoding and depend on both rates. For the weak decoder exponent,
the optimization over all parameters, although possible, is computationally hard. We therefore
examine a few interesting choices of the parameters, in order to reduce the dimensionality of the
optimization process.
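To give a feel for the four-parameter optimization of the weak decoder exponent, the following sketch evaluates it by a coarse grid search for a binary example. It is a minimal illustration, not the numerical procedure used in the paper: the grid resolution `K`, the function names, and the choice of a BSC from $U$ to $X$ and from $X$ to $Z$ are assumptions. The grid excludes the value zero for $\alpha$ and $\mu$ to avoid dividing by zero in $a/b$.

```python
import math

def f(a, b, z, p_u, p_x_u, p3):
    """f(a,b,z) = sum_u P(u) * [sum_x P(x|u) * P3(z|x)**(a/b)]**b, for b > 0."""
    total = 0.0
    for u, pu in enumerate(p_u):
        inner = sum(p_x_u[u][x] * p3[x][z] ** (a / b) for x in range(len(p3)))
        total += pu * inner ** b
    return total

def E0(rho, lam, alpha, mu, p_u, p_x_u, p3):
    """E_0 = -log sum_z f(1 - rho*lam, alpha, z) * f(lam, mu, z)**rho."""
    s = sum(f(1 - rho * lam, alpha, z, p_u, p_x_u, p3)
            * f(lam, mu, z, p_u, p_x_u, p3) ** rho
            for z in range(len(p3[0])))
    return -math.log(s)

def Ez1_grid(R_yz, R_y, p_u, p_x_u, p3, K=10):
    """Coarse grid search over (rho, lam, alpha, mu); rates are in nats."""
    grid = [k / K for k in range(1, K + 1)]
    best = 0.0  # the value attained at rho = 0, alpha = 1
    for rho in grid:
        for mu in grid:
            for lam in grid:
                if lam > mu:
                    continue
                for alpha in grid:
                    if alpha < 1 - rho * lam - 1e-12:
                        continue
                    val = (E0(rho, lam, alpha, mu, p_u, p_x_u, p3)
                           - (alpha + rho * mu - 1) * R_y - rho * R_yz)
                    best = max(best, val)
    return best

def bsc(p):
    """Transition matrix chan[input][output] of a BSC with crossover p."""
    return [[1 - p, p], [p, 1 - p]]

# Assumed example: U uniform, U -> X is BSC(0.1), X -> Z is BSC(0.3).
exponent = Ez1_grid(0.01, 0.01, [0.5, 0.5], bsc(0.1), bsc(0.3))
```

A finer grid or a numerical optimizer would tighten the value; the grid search only yields a lower bound on the maximum.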

1. Let $\alpha = \mu$. In this case, we show in Appendix A.1 that $\forall\lambda:\ E_0(\rho, \frac{1}{1+\rho}, \alpha, \alpha) \ge E_0(\rho, \lambda, \alpha, \alpha)$;
thus, the choice of $\lambda = \frac{1}{1+\rho}$ is optimal. Applying $\alpha = \mu$, $\lambda = \frac{1}{1+\rho}$, our bound becomes:

$$\max_{0\le\rho\le1,\ \frac{1}{1+\rho}\le\alpha\le1}\left\{-\log\sum_z\left[\sum_u P(u)\left(\sum_x P(x|u)P_3(z|x)^{\frac{1}{\alpha(1+\rho)}}\right)^{\alpha}\right]^{1+\rho} - [\alpha(1+\rho)-1]R_y - \rho R_{yz}\right\} \tag{3}$$

This is a somewhat more compact expression with only two parameters. Numerical results indicate
that, at least for the BSC we tested, the choice $\alpha = \mu$ is the optimal choice. However, we do not
have a proof that this is true in general.

2. As a further restriction of item no. 1 above, consider the choice $\alpha = \mu = \frac{1}{1+\rho}$. In this case, the
expressions in the inner-most brackets of (17) and (18) become $\sum_x P(x|u)P_3(z|x) = P_4(z|u)$, and
$\alpha + \rho\mu - 1 = 0$. Thus, we get an exponent given by

$$\max_{0\le\rho\le1}\left\{-\log\sum_z\left[\sum_u P(u)P_4(z|u)^{\frac{1}{1+\rho}}\right]^{1+\rho} - \rho R_{yz}\right\} \tag{4}$$

which is exactly the ordinary Gallager function for the channel $P_4(z|u)$, obtained by sub-optimal
decoding at the weak user [4], ignoring the knowledge of the refined codebook of each cloud center.
This means that the exponents of Theorem 1 are at least as tight as the result of [4]. Numerical
results show that, at least for the degraded BSC case, the exponents of Theorem 1 are tighter.

3. Another further restriction of item no. 1 is the choice $\alpha = \mu = 1$, which gives:

$$\max_{0\le\rho\le1}\left\{-\log\sum_z\left[\sum_x P(x)P_3(z|x)^{\frac{1}{1+\rho}}\right]^{1+\rho} - \rho(R_y + R_{yz})\right\} \tag{5}$$

This corresponds to i.i.d. random coding according to $P(x) = \sum_u P(u)P(x|u)$ at rate $R_y + R_{yz}$.
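The reductions in items 2 and 3 are easy to check numerically. The sketch below evaluates the objective of the two-parameter bound of item 1 for a fixed $(\rho, \alpha)$ and compares it with the ordinary Gallager objective at the two special values of $\alpha$. The function names and the binary example are assumptions made for illustration.

```python
import math

def bsc(p):
    """Transition matrix chan[input][output] of a BSC with crossover p."""
    return [[1 - p, p], [p, 1 - p]]

def restricted_objective(rho, alpha, R_y, R_yz, p_u, p_x_u, p3):
    """Objective of the two-parameter bound (alpha = mu, lambda = 1/(1+rho))."""
    s = 0.0
    for z in range(len(p3[0])):
        inner = sum(pu * sum(p_x_u[u][x] * p3[x][z] ** (1.0 / (alpha * (1 + rho)))
                             for x in range(len(p3))) ** alpha
                    for u, pu in enumerate(p_u))
        s += inner ** (1 + rho)
    return -math.log(s) - (alpha * (1 + rho) - 1) * R_y - rho * R_yz

def gallager_objective(rho, R, p_in, chan):
    """Ordinary Gallager objective E_0(rho) - rho * R for chan[input][output]."""
    s = sum(sum(p_in[x] * chan[x][y] ** (1.0 / (1 + rho))
                for x in range(len(chan))) ** (1 + rho)
            for y in range(len(chan[0])))
    return -math.log(s) - rho * R

# Assumed example: U uniform, U -> X is BSC(0.1), X -> Z is BSC(0.3).
p_u, p_x_u, p3 = [0.5, 0.5], bsc(0.1), bsc(0.3)
rho, R_y, R_yz = 0.7, 0.02, 0.01
# Composite channel P4(z|u) and the marginal P(x).
p4 = [[sum(p_x_u[u][x] * p3[x][z] for x in range(2)) for z in range(2)]
      for u in range(2)]
p_x = [sum(p_u[u] * p_x_u[u][x] for u in range(2)) for x in range(2)]

item2 = restricted_objective(rho, 1 / (1 + rho), R_y, R_yz, p_u, p_x_u, p3)
item3 = restricted_objective(rho, 1.0, R_y, R_yz, p_u, p_x_u, p3)
```

With these definitions, `item2` coincides with `gallager_objective(rho, R_yz, p_u, p4)` and `item3` with `gallager_objective(rho, R_y + R_yz, p_x, p3)`, up to floating-point error.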

3.2 A bound based on type class enumerators

Let $(X, U, Y, Z)$ be a quadruplet of random variables, taking values in $\mathcal{X} \times \mathcal{U} \times \mathcal{Y} \times \mathcal{Z}$, and being
governed by a generic joint distribution $Q_{XUYZ} = \{Q_{XUYZ}(x,u,y,z),\ x \in \mathcal{X},\ u \in \mathcal{U},\ y \in \mathcal{Y},\ z \in \mathcal{Z}\}$,
where, as introduced in Section 2, $\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ are, respectively, the channel input and output
alphabets and $\mathcal{U}$ is the alphabet of the auxiliary random variable, which is of finite cardinality. Let us
denote the various marginals and conditional distributions derived from $Q_{XUYZ}$ using the standard
conventions, e.g., $Q_X$ is the marginal distribution of $X$, $Q_{U|Z}$ is the conditional distribution of $U$
given $Z$, etc. Expectation w.r.t. $Q_{XUYZ}$, or $Q$ for short, will be denoted by $E_Q$. Similarly,
information measures, like entropy and conditional entropy induced by $Q$, will be subscripted by
$Q$, e.g., $H_Q(X|U, Z)$ is the conditional entropy of $X$ given $U$ and $Z$ under $Q = Q_{XUYZ}$. In the
following description, we allow various joint distributions $\{Q\}$ to govern $(X, U, Y, Z)$.

Let $Q_Y$, $Q_Z$ be given. We define $\mathcal{G}(R_y, Q_{U|Z})$ to be the set of conditional distributions $\{Q_{X|U,Z}\}$
that satisfy $R_y + E_Q\log P(X|U) + H_Q(X|U, Z) > 0$, where, as described in Section 2, $P(x|u)$ is the
random coding distribution according to which the codewords $\{\mathbf{x}_{m,i}\}$ are drawn given $\mathbf{u}_m$. Similarly,
let $\mathcal{G}(R_y, Q_{U|Y})$ be the set of conditional distributions $\{Q_{X|U,Y}\}$ that satisfy $R_y + E_Q\log P(X|U) +
H_Q(X|U,Y) > 0$. Next define,

$$\alpha(Q_{U|Z}) = (1-\rho\lambda)\max_{Q_{X|UZ}\in\mathcal{G}(R_y,Q_{U|Z})}\left[E_Q\log P(X|U) + H_Q(X|U,Z) + E_Q\log P_3(Z|X)\right] \tag{6}$$

$$\beta(Q_{U|Z}) = \rho\lambda R_y + \max_{Q_{X|UZ}\in\mathcal{G}^c(R_y,Q_{U|Z})}\left[E_Q\log P(X|U) + H_Q(X|U,Z) + (1-\rho\lambda)E_Q\log P_3(Z|X)\right],$$

$$E_{\alpha\beta}(Q_{U|Z}) = \max\{\alpha(Q_{U|Z}), \beta(Q_{U|Z})\}. \tag{7}$$

where, as described in Section 2, $P_3(\cdot|\cdot)$ is the overall channel to the weak user. Similarly, define:

l{Qu\Y) = P\Ry + ^ max [EQ\ogP{X\U)+ 
\ Qx\u,Y^y(Ry,Qu\Y) 

HQiX\Y, U) + \EQ\ogP^{Y\X)]) (8) 

aQu\Y) = Ry + ^ max \EQ\ogP{X\U)+ 
Qx\u,Y&y''{Ry,Qu\Y) 

Hq{X\U, Y) + ipX)EQ log P{Y\X)] (9) 
E^dQu\z) = maxHQu\z), C(Qc/|z)}- (10) 



Also, define

$$m(Q_{U|Z}) = R_{yz} + H_Q(U|Z) + E_Q\log P(U)$$

where, as said, $\{P(u)\}$ is the random coding distribution of the cloud centers $\{\mathbf{u}_m\}$. Now,

$$N(Q_{X|Z}, Q_{U|Z}, R_y) = R_y + \max_{Q_{X|UZ}}\left[E_Q\log P(X|U) + H_Q(X|U,Z)\right], \tag{11}$$

where the maximization is over all $\{Q_{X|UZ}\}$ that are consistent with $Q_{X|Z}$. Next, we define
$\mathcal{G}_z(R_{yz}) = \{Q_{U|Z}:\ R_{yz} + H_Q(U|Z) + E_Q\log P(U) > 0\}$,

B{Qx\Z, Qu\Z, Ry) = pN{Qx\Z, Qu\Z, Ry) " X^{^^Q^\^'Qmz,Rv)>0} (12) 
and



CiQx\z,Qu\z,Ry) = NiQx\z,Qu\z,Ry) ■ (pAf{^(«^i^''2^i^'^-)>o}, 



(13) 



We similarly define Qy{Ryz)-,N{Qx\Y-,Qu\YiRy) ^-^d m(Q[/|y) by replacing the respective role of 
ZhyY. Next define 



D{Qx\Y,Qu\Y.Ry) = NiQx\Y,Qu\Y,Ry) ' ,i?.)>0} ^ 



(14) 



We also define 



^(Qx|z) = max< max [B{Qx\z,Qu\z,Ry)+ 

\Qu\Z^yz(.Ryz) 

pm{Qu\z)\, max \C[Qx\z, Qu\z, Ry) + "^(Q^7|z)] ) , 
^(Qx|y) = max i p max [N{Qx\y,Qu\y, Ry) + fri{Qu\Y)], 
max [D{Qx\Y, Qu\Y, Ry) + rh{Qu\Y)] \ , 



Ei{Qz, Ry, Ryz, P, A) = min 

Qu\z 

E2{Qz, Ry, Ryz, P, A) = min 

Qx\z 



Eq log ^ - HQiU\Z) - E^p{Qu\z) 



pAlog 



A 



E2,{Qy, P, A) = min Eq log 

Qx,U\Y L 



P3{Z\X) 
1 



E{Qx\z) + P^Ry 



HQ{X,U\Y) + {l-p\)EQ\og 



A 



P{U,X) 

Eq log ^ - E,^{Qu\y) - H{U\Y) 
E-,{Qy, Ry, Ryz, P, A) = min XpByx log p^^l^^x) " ^^^^\^^ 



PiY\X) 



Ei{QY,Ry,Ryz,p,>^) = min 

Qu\Y 
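Finally,

$$E_{z,2}(R_{yz}, R_y) = \max_{\rho\ge0}\ \max_{0\le\lambda\le1/\rho}\ \min_{Q_Z}\left[E_1(Q_Z, R_y, R_{yz}, \rho, \lambda) + E_2(Q_Z, R_y, R_{yz}, \rho, \lambda) - H_Q(Z)\right],$$

$$E_{y,2}(R_{yz}, R_y) = \max_{\rho\ge0}\ \max_{\lambda\ge0}\ \min_{Q_Y}\left[E_3(Q_Y, \rho, \lambda) + \max\{E_4(Q_Y, R_y, R_{yz}, \rho, \lambda),\ E_5(Q_Y, R_y, R_{yz}, \rho, \lambda)\} - H_Q(Y)\right]. \tag{15}$$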

The second main result of this paper is given in the following theorem:
Theorem 2: For the degraded broadcast channel defined in Section 2, the pair
$(E_{z,2}(R_{yz},R_y), E_{y,2}(R_{yz},R_y))$, as defined in eq. (15), is an attainable pair in the random coding
sense.

These exponents also pertain to optimal decoding and they depend on both rates. Unlike the
exponent of Theorem 1, where the weak decoder exponent had four free parameters, here, $E_{z,2}$
has only two free parameters $(\lambda, \rho)$. Moreover, $(E_{z,2}(R_{yz}, R_y), E_{y,2}(R_{yz}, R_y))$ are at least as tight
as the exponents of the previous section since, as we will see in the following, their derivation is
exponentially tight after the same initial step we take in the proof of Theorem 1. Numerical results
show that $E_{z,2}$ is tighter, at least for the binary symmetric case.

4 Derivation of the Gallager Type Bound 

In this section we prove Theorem 1. 
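4.1 The Weak Decoder

Applying Gallager's general upper bound [14, p. 65] to the "channel" $P(\mathbf{z}|m) = \frac{1}{M_y}\sum_{i=1}^{M_y} P_3(\mathbf{z}|\mathbf{x}_{m,i})$,
we have for $\lambda \ge 0$, $\rho \ge 0$:

$$P_{E_m}^z \le \sum_{\mathbf{z}}\left[\frac{1}{M_y}\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{x}_{m,i})\right]^{1-\rho\lambda}\cdot\left[\sum_{m'\ne m}\left(\frac{1}{M_y}\sum_{j=1}^{M_y}P_3(\mathbf{z}|\mathbf{x}_{m',j})\right)^{\lambda}\right]^{\rho}$$

Thus, the average error probability w.r.t. the ensemble of codes is upper bounded in terms of
the expectations of each of the bracketed terms above (since messages from different clouds are
independent). Define:

$$A \triangleq E\left\{\left[\frac{1}{M_y}\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m,i})\right]^{1-\rho\lambda}\right\}, \qquad B \triangleq E\left\{\left[\sum_{m'\ne m}\left(\frac{1}{M_y}\sum_{j=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m',j})\right)^{\lambda}\right]^{\rho}\right\}$$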



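As for $A$, we have

$$
\begin{aligned}
A &= E\left\{\left[\frac{1}{M_y}\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m,i})\right]^{1-\rho\lambda}\right\} \\
&= M_y^{\rho\lambda-1}\sum_{\mathbf{u}}P(\mathbf{u})\,E\left\{\left(\left[\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m,i})\right]^{(1-\rho\lambda)/\alpha}\right)^{\alpha}\,\Bigg|\,\mathbf{u}\right\} \\
&\le M_y^{\rho\lambda-1}\sum_{\mathbf{u}}P(\mathbf{u})\,E\left\{\left(\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m,i})^{(1-\rho\lambda)/\alpha}\right)^{\alpha}\,\Bigg|\,\mathbf{u}\right\}, \qquad \alpha \ge 1-\rho\lambda \\
&\le M_y^{\alpha+\rho\lambda-1}\sum_{\mathbf{u}}P(\mathbf{u})\left[\sum_{\mathbf{x}}P(\mathbf{x}|\mathbf{u})P_3(\mathbf{z}|\mathbf{x})^{(1-\rho\lambda)/\alpha}\right]^{\alpha}, \qquad \alpha \le 1
\end{aligned} \tag{16}
$$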
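For a memoryless channel and $P(u)$, $P(x|u)$ as defined in Section 2, we have

$$
\begin{aligned}
A &\le M_y^{\alpha+\rho\lambda-1}\sum_{\mathbf{u}}P(\mathbf{u})\left[\sum_{\mathbf{x}}\prod_{t=1}^{n}P(x_t|u_t)P_3(z_t|x_t)^{(1-\rho\lambda)/\alpha}\right]^{\alpha} \\
&= M_y^{\alpha+\rho\lambda-1}\sum_{\mathbf{u}}P(\mathbf{u})\prod_{t=1}^{n}\left[\sum_{x}P(x|u_t)P_3(z_t|x)^{(1-\rho\lambda)/\alpha}\right]^{\alpha} \\
&= M_y^{\alpha+\rho\lambda-1}\prod_{t=1}^{n}\sum_{u}P(u)\left[\sum_{x}P(x|u)P_3(z_t|x)^{(1-\rho\lambda)/\alpha}\right]^{\alpha}
\end{aligned} \tag{17}
$$

Regarding $B$, we similarly obtain:

$$
\begin{aligned}
B &= E\left\{\left[\sum_{m'\ne m}\left(\frac{1}{M_y}\sum_{j=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m',j})\right)^{\lambda}\right]^{\rho}\right\} \\
&\le \left[\sum_{m'\ne m}E\left\{\left(\frac{1}{M_y}\sum_{j=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m',j})\right)^{\lambda}\right\}\right]^{\rho}, \qquad 0 \le \rho \le 1 \\
&\le M_y^{-\rho\lambda}M_{yz}^{\rho}\left[E\left\{\left(\sum_{j=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m',j})\right)^{\lambda}\right\}\right]^{\rho} \\
&\le M_y^{-\rho\lambda}M_{yz}^{\rho}\left[E\left\{\left(\sum_{j=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m',j})^{\lambda/\mu}\right)^{\mu}\right\}\right]^{\rho}, \qquad \mu \ge \lambda \\
&\le M_y^{\rho(\mu-\lambda)}M_{yz}^{\rho}\left[\sum_{\mathbf{u}}P(\mathbf{u})\left(\sum_{\mathbf{x}}P(\mathbf{x}|\mathbf{u})P_3(\mathbf{z}|\mathbf{x})^{\lambda/\mu}\right)^{\mu}\right]^{\rho}, \qquad \mu \le 1 \\
&= M_y^{\rho(\mu-\lambda)}M_{yz}^{\rho}\prod_{t=1}^{n}\left[\sum_{u}P(u)\left(\sum_{x}P(x|u)P_3(z_t|x)^{\lambda/\mu}\right)^{\mu}\right]^{\rho}
\end{aligned} \tag{18}
$$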
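Denoting $f(a,b,z) = \sum_u P(u)\left[\sum_x P(x|u)P_3(z|x)^{a/b}\right]^b$, we obtain:

$$P_E^z \le M_y^{\alpha+\rho\mu-1}M_{yz}^{\rho}\left[\sum_z f(1-\rho\lambda,\alpha,z)\cdot f^{\rho}(\lambda,\mu,z)\right]^n = e^{-n\left[E_0(\rho,\lambda,\alpha,\mu) - (\alpha+\rho\mu-1)R_y - \rho R_{yz}\right]} \tag{19}$$

where

$$E_0(\rho,\lambda,\alpha,\mu) = -\log\left[\sum_z f(1-\rho\lambda,\alpha,z)\cdot f^{\rho}(\lambda,\mu,z)\right] \tag{20}$$

After optimizing over all free parameters, we get $P_E^z \le \exp\{-nE(R_y,R_{yz})\}$, where

$$E(R_y, R_{yz}) = \max_{0\le\rho\le1,\ 0\le\lambda\le\mu\le1,\ 1-\rho\lambda\le\alpha\le1}\left\{E_0(\rho,\lambda,\alpha,\mu) - (\alpha+\rho\mu-1)R_y - \rho R_{yz}\right\} \tag{21}$$

which is the weak decoder exponent of Theorem 1.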



4.2 The Strong Decoder 



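The strong decoder (Y decoder) has to decode correctly both indices $(m, i)$ of the transmitted $\mathbf{x}_{m,i}$.
Applying Gallager's bound [14, p. 65], and assuming, without loss of generality, that $(m, i) = (1, 1)$
was sent, we have for $\lambda \ge 0$, $\rho \ge 0$:

$$
\begin{aligned}
P_E^y &\le \sum_{\mathbf{y}}P_1(\mathbf{y}|\mathbf{x}_{1,1})^{1-\rho\lambda}\left[\sum_{i=2}^{M_y}P_1(\mathbf{y}|\mathbf{x}_{1,i})^{\lambda} + \sum_{m=2}^{M_{yz}}\sum_{i=1}^{M_y}P_1(\mathbf{y}|\mathbf{x}_{m,i})^{\lambda}\right]^{\rho} \\
&\le \sum_{\mathbf{y}}P_1(\mathbf{y}|\mathbf{x}_{1,1})^{1-\rho\lambda}\left\{\left[\sum_{i=2}^{M_y}P_1(\mathbf{y}|\mathbf{x}_{1,i})^{\lambda}\right]^{\rho} + \left[\sum_{m=2}^{M_{yz}}\sum_{i=1}^{M_y}P_1(\mathbf{y}|\mathbf{x}_{m,i})^{\lambda}\right]^{\rho}\right\}, \qquad \rho \le 1 \\
&\triangleq P_{E_{y1}} + P_{E_{y2}}
\end{aligned} \tag{22}
$$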

The two resulting expressions deal, respectively, with two separate error events:

1. The Y decoder chose a different private message from the correct cloud.

2. The Y decoder chose a message from a wrong cloud.

The first expression was treated in [4]. We have: $\bar{P}_{E_{y1}} \le e^{-nE_{y1}(R_y,\rho)}$, where

$$E_{y1}(R_y,\rho) = -\rho R_y - \log\sum_y\sum_u P(u)\left[\sum_x P(x|u)P_1(y|x)^{\frac{1}{1+\rho}}\right]^{1+\rho} \tag{23}$$

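We now turn to the second term in (22).

$$P_{E_{y2}} = \sum_{\mathbf{y}}P_1(\mathbf{y}|\mathbf{x}_{1,1})^{1-\rho\lambda}\left[\sum_{m=2}^{M_{yz}}\sum_{i=1}^{M_y}P_1(\mathbf{y}|\mathbf{x}_{m,i})^{\lambda}\right]^{\rho} \tag{24}$$

Here, when averaging over the ensemble, since the term in brackets of (24) originates from a
different cloud, it is independent of the first term. Thus,

$$
\begin{aligned}
\bar{P}_{E_{y2}} &= \sum_{\mathbf{y}}E\left[P_1(\mathbf{y}|\mathbf{X}_{1,1})^{1-\rho\lambda}\right]\cdot E\left\{\left[\sum_{m=2}^{M_{yz}}\sum_{i=1}^{M_y}P_1(\mathbf{y}|\mathbf{X}_{m,i})^{\lambda}\right]^{\rho}\right\} \\
&\le \sum_{\mathbf{y}}E\left[P_1(\mathbf{y}|\mathbf{X}_{1,1})^{1-\rho\lambda}\right]\cdot\left[\sum_{m=2}^{M_{yz}}\sum_{i=1}^{M_y}E\left\{P_1(\mathbf{y}|\mathbf{X}_{m,i})^{\lambda}\right\}\right]^{\rho} \\
&= \sum_{\mathbf{y}}\left[\sum_{\mathbf{x}}P(\mathbf{x})P_1(\mathbf{y}|\mathbf{x})^{1-\rho\lambda}\right]\cdot\left[\sum_{m=2}^{M_{yz}}\sum_{i=1}^{M_y}\sum_{\mathbf{x}}P(\mathbf{x})P_1(\mathbf{y}|\mathbf{x})^{\lambda}\right]^{\rho} \\
&\le (M_yM_{yz})^{\rho}\sum_{\mathbf{y}}\left[\sum_{\mathbf{x}}P(\mathbf{x})P_1(\mathbf{y}|\mathbf{x})^{1-\rho\lambda}\right]\cdot\left[\sum_{\mathbf{x}}P(\mathbf{x})P_1(\mathbf{y}|\mathbf{x})^{\lambda}\right]^{\rho}
\end{aligned}
$$

Selecting¹ $\lambda = \frac{1}{1+\rho}$ yields

$$\bar{P}_{E_{y2}} \le (M_yM_{yz})^{\rho}\sum_{\mathbf{y}}\left[\sum_{\mathbf{x}}P(\mathbf{x})P_1(\mathbf{y}|\mathbf{x})^{\frac{1}{1+\rho}}\right]^{1+\rho}$$

For a memoryless channel, we get:

$$\bar{P}_{E_{y2}} \le e^{-nE_{y2}(R_y,R_{yz},\rho)}$$

where

$$E_{y2}(R_y, R_{yz}, \rho) = -\rho(R_y + R_{yz}) - \log\sum_y\left[\sum_x P(x)P_1(y|x)^{\frac{1}{1+\rho}}\right]^{1+\rho}$$

Note that this corresponds to the random coding exponent for the channel $X \to Y$ at rate $R_y + R_{yz}$.

¹This choice is optimal for the same reason it is optimal in the single user regime, see [15] Prob. 5.6.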



To summarize, we have:

$$\bar{P}_E^y(R_y, R_{yz}) \le e^{-n\max_{0\le\rho\le1}E_{y1}(R_y,\rho)} + e^{-n\max_{0\le\rho\le1}E_{y2}(R_y,R_{yz},\rho)}$$

Taking the dominant exponent of the above sum yields the strong decoder exponent of Theorem 1.
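
The strong decoder exponent is simple enough to evaluate directly. The following sketch computes the two one-parameter maximizations on a grid of $\rho$ values and takes the smaller one; the function names, the grid resolution `K` and the binary example are assumptions, and rates are taken in nats.

```python
import math

def bsc(p):
    """Transition matrix chan[input][output] of a BSC with crossover p."""
    return [[1 - p, p], [p, 1 - p]]

def Ey1(R_y, rho, p_u, p_x_u, p1):
    """Error within the correct cloud: Gallager-type function given U."""
    s = sum(pu * sum(p_x_u[u][x] * p1[x][y] ** (1 / (1 + rho))
                     for x in range(len(p1))) ** (1 + rho)
            for y in range(len(p1[0])) for u, pu in enumerate(p_u))
    return -rho * R_y - math.log(s)

def Ey2(R_y, R_yz, rho, p_u, p_x_u, p1):
    """Error to a wrong cloud: ordinary Gallager function at the sum rate."""
    p_x = [sum(p_u[u] * p_x_u[u][x] for u in range(len(p_u)))
           for x in range(len(p1))]
    s = sum(sum(p_x[x] * p1[x][y] ** (1 / (1 + rho))
                for x in range(len(p1))) ** (1 + rho)
            for y in range(len(p1[0])))
    return -rho * (R_y + R_yz) - math.log(s)

def strong_exponent(R_yz, R_y, p_u, p_x_u, p1, K=100):
    """min of the two maximized exponents, rho on a grid in [0, 1]."""
    rhos = [k / K for k in range(K + 1)]
    e1 = max(Ey1(R_y, r, p_u, p_x_u, p1) for r in rhos)
    e2 = max(Ey2(R_y, R_yz, r, p_u, p_x_u, p1) for r in rhos)
    return min(e1, e2)

# Assumed example: U uniform, U -> X is BSC(0.1), X -> Y is BSC(0.05).
E_y = strong_exponent(0.01, 0.01, [0.5, 0.5], bsc(0.1), bsc(0.05))
```

Taking the minimum reflects the fact that the sum of two exponentially decaying terms is dominated by the slower one.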
4.3 Numerical Results for the Degraded BSC 

In this section, we show some numerical results of our error exponents and compare them to the
exponents that were derived in [4]. Our setup is that of a binary broadcast channel with a binary
input $X$ and separate binary symmetric channels to $Y$ and $Z$ with parameters $p_y, p_z$ ($p_y < p_z < \frac{1}{2}$),
respectively. This channel can be recast into a cascade of (degraded) binary symmetric channels
with parameters $p_y, \alpha$, where $\alpha = P(z \ne y) = \frac{p_z - p_y}{1 - 2p_y}$. In this case, the auxiliary random variable $U$
is also binary. By symmetry, $U$ is distributed uniformly on $\{0, 1\}$ and connected to $X$ by another
BSC with parameter $\beta$ (see Fig. 1a). The capacity region is given by [16]:

$$R_{yz} \le 1 - h(\beta * p_z)$$

$$R_y \le h(\beta * p_y) - h(p_y)$$

where $\beta * p = \beta(1 - p) + (1 - \beta)p$ and $h(x)$ is the binary entropy function given by $-x\log x - (1 - x)\log(1 - x)$ for $0 \le x \le 1$.
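
The boundary of this capacity region can be traced by sweeping $\beta$ from 0 to 0.5. The sketch below is a minimal illustration; it assumes the entropy is measured in bits (so that the constant 1 in the first inequality is the capacity of a noiseless binary channel), and the function names are hypothetical.

```python
import math

def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def conv(b, p):
    """Binary convolution b * p = b(1 - p) + (1 - b)p."""
    return b * (1 - p) + (1 - b) * p

def cascade_parameter(p_y, p_z):
    """Crossover of the BSC from Y to Z in the cascade, so that p_y * alpha = p_z."""
    return (p_z - p_y) / (1 - 2 * p_y)

def capacity_boundary(p_y, p_z, num=11):
    """List of (beta, R_y, R_yz) boundary points for beta in [0, 0.5]."""
    points = []
    for k in range(num):
        beta = 0.5 * k / (num - 1)
        r_y = h2(conv(beta, p_y)) - h2(p_y)
        r_yz = 1 - h2(conv(beta, p_z))
        points.append((beta, r_y, r_yz))
    return points

# Channel parameters taken as p_y = 0.05, p_z = 0.3.
boundary = capacity_boundary(0.05, 0.3)
```

At $\beta = 0$ all the rate goes to the common message, and at $\beta = 0.5$ all of it goes to the private message.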

Figure 1: (a) The recast channel with the auxiliary variable. (b) The capacity region $R_{yz}(R_y)$ with
$p_y = 0.05$, $p_z = 0.3$.

Denote the exponents of [4], calculated for this model, by $E_{g,y}, E_{g,z}$ for the strong and weak
decoder, respectively. For a general channel, $E_{g,z}$ is given by (4). $E_{g,y}$ is the minimum between
(23) and



max < 

p 



log 



y u 



pR 



■yz 



(27) 



For given $R_y$ and $R_{yz}$, $\beta$ controls the tradeoff between the exponents $(E_y, E_z)$. For example, if we
are interested in finding the attainable pair $(E_y, E_z)$ with maximal $E_z$ for a given pair $(R_y, R_{yz})$,
the maximizing $\beta$ will be the smallest $\beta$ s.t. $E_y$ is positive, i.e., the value of $\beta$ that maximizes
$1 - h(\beta * p_z)$ while keeping $E_y > 0$. In Fig. 2, we show the best attainable (maximized over $\beta$)
$E_y(R_y)$ for a given $R_{yz}$ and the best attainable $E_z(R_{yz})$ for a given $R_y$ compared to $E_{g,y}(R_y)$ and
$E_{g,z}(R_{yz})$. In both cases the new exponents are better.

Note that the exponent value vanishes when the operating point is outside the capacity region
(see Fig. 1b). The reason for this is that in Fig. 2a and Fig. 2b, we allowed the error exponents of
the strong and weak decoders, respectively, to be arbitrarily small. This allowed us to get arbitrarily
close to the capacity region curve.

Figure 2: Comparing $E_y, E_z$ (solid curves) to $E_{g,y}, E_{g,z}$ (dotted curves) maximized over $\beta$. (a)
$E_z(R_{yz})$ vs $E_{g,z}(R_{yz})$ for a fixed $R_y$ (a small power of ten). (b) $E_y(R_y)$ vs $E_{g,y}(R_y)$ for fixed $R_{yz} = 0.005$.

Although the values of $E_z$ and $E_{g,z}$ in Fig. 2a are close, in the numerical calculation, it turned
out that $\alpha = \mu \ne \frac{1}{1+\rho}$. We said above that in this case, the maximizing $\lambda$ equals $\frac{1}{1+\rho}$. Therefore,
since different parameters maximized $E_z$ than the parameters in (4), the new exponent is strictly
larger than the exponent in [4] for all $R_{yz}$ and the given $R_y$ as long as $R_{yz} < 1 - h(p_z)$.

Denote the maximal value² of $E_y, E_z$ by $E_y^{\max}, E_z^{\max}$, respectively. In Fig. 3 we repeat the
calculation of Fig. 2. However, here we restrict $E_y \ge E_y^{\min} = E_y^{\max}/4$, $E_z \ge E_z^{\min} = E_z^{\max}/4$ in
Fig. 3a and Fig. 3b, respectively. This time the exponents vanish deep inside the capacity region.

The reason for the singular points of $E_y$ in Fig. 2b and Fig. 3b is the behavior of $E_z$ as
a function of $\beta$ (illustrated in Fig. 4). Note that as $\beta$ increases, the channel $U \to Z$ becomes
noisier. Therefore $E_z(R_{yz}, R_y)$ is non-increasing in $\beta$. For a given $(R_{yz}, R_y)$ there is a critical
value, $\beta_c$, such that for every $\beta \ge \beta_c$, $E_z(R_y, R_{yz}, \beta \ge \beta_c) = E_{z_0}(R_y, R_{yz})$ is constant and has the
form of (5), which is the single user error exponent ([14] p. 65) for the channel $X \to Z$ at rate
$R_y + R_{yz}$. If $E_{z_0}(R_y, R_{yz})$ is greater than the threshold (for example $E_{z_0} \ge E_z^{\max}/4$ in Fig. 3b)
then the maximization of $E_y(R_y, R_{yz})$ is unconstrained and is attained by $\beta = 0.5$. However, as
$R_y$ increases, $E_{z_0}(R_y, R_{yz})$ decreases and at some critical $R_y^c$, $E_{z_0}(R_y^c, R_{yz})$ becomes smaller than
the threshold (illustrated in Fig. 4b).

²The maximal value is the single user error exponent ([14] p. 65) for the channel from $X$ to $Y$ and from $X$ to $Z$
for the strong and weak decoders, respectively, i.e., for a given $R_{yz}$, the maximal value for $E_z$ is obtained with $R_y = 0$.
For a given $R_y$ the maximal $E_y$ is obtained with $R_{yz} = 0$, $\beta = 0.5$.

Figure 3: Comparing $E_y, E_z$ (solid curves) to $E_{g,y}, E_{g,z}$ (dotted curves) maximized over $\beta$. (a)
$E_z(R_{yz})$ vs $E_{g,z}(R_{yz})$ for a fixed $R_y$ (a small power of ten) with $E_y \ge E_y^{\max}/4$. (b) $E_y(R_y)$ vs $E_{g,y}(R_y)$ for fixed
$R_{yz} = 0.005$ with $E_z \ge E_z^{\max}/4$.




Figure 4: Illustration of $E_z$ as a function of $\beta$. (a) For some $R_y < R_y^c$, $E_z$ is above the threshold.
(b) For $R_y > R_y^c$.

Thus, for $R_y > R_y^c$, the maximization of $E_y$ becomes constrained and the largest valid $\beta$ is
much smaller than 0.5. Hence the sudden drop in the value of $E_y$. This phenomenon is not seen in
$E_{g,y}$ since $E_{g,z}$ does not depend on $R_y$ and the maximizing $\beta$ is the same for all $R_y$.



5 Derivation for the Type Class Enumerators Approach 



In this section, we prove Theorem 2. Throughout, we rely on the method of types [17]. We start 
with the notation we use in this section. 

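The empirical distribution pertaining to a vector $\mathbf{x} \in \mathcal{X}^n$ will be denoted by $Q_{\mathbf{x}}$ and its type class
by $T_{\mathbf{x}}$. In other words, $Q_{\mathbf{x}} = \{q_{\mathbf{x}}(a),\ a \in \mathcal{X}\}$, where $q_{\mathbf{x}}(a) = n_{\mathbf{x}}(a)/n$, $n_{\mathbf{x}}(a)$ being the number
of occurrences of the letter $a$ in $\mathbf{x}$. Similar conventions apply to empirical joint distributions of
pairs of letters, $(a, b) \in \mathcal{X} \times \mathcal{Y}$, extracted from the corresponding pairs of vectors $(\mathbf{x}, \mathbf{y})$. Similarly,
$q_{\mathbf{x}|\mathbf{y}}(a|b) = q_{\mathbf{xy}}(a, b)/q_{\mathbf{y}}(b)$ will denote the empirical conditional probability of $X = a$ given $Y = b$
(with the convention that $0/0 = 0$), and $Q_{\mathbf{x}|\mathbf{y}}$ will denote $\{q_{\mathbf{x}|\mathbf{y}}(a|b),\ a \in \mathcal{X},\ b \in \mathcal{Y}\}$. $T_{\mathbf{x}|\mathbf{y}}$ will denote
the conditional type class of $\mathbf{x}$ given $\mathbf{y}$. The expectation w.r.t. the empirical distribution of $(\mathbf{x}, \mathbf{y})$
will be denoted by $\hat{E}_{\mathbf{xy}}\{\cdot\}$, i.e., for a given function $f: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, we define $\hat{E}_{\mathbf{xy}}\{f(X,Y)\}$ as
$\sum_{(a,b)\in\mathcal{X}\times\mathcal{Y}} q_{\mathbf{xy}}(a, b)f(a, b)$, where in this notation, $X$ and $Y$ are understood to be random variables
jointly distributed according to $Q_{\mathbf{xy}}$. The entropy with respect to the empirical distribution of
a vector $\mathbf{x}$ will be denoted by $\hat{H}(\mathbf{x})$. Finally, the notation $a_n \doteq b_n$ means that $\frac{1}{n}\log\frac{a_n}{b_n} \to 0$ as
$n \to \infty$.

We start this section with the same initial step we used in the previous section. Namely,
Gallager's general upper bound [14, p. 65] to the "channel" $P(\mathbf{z}|m) = \frac{1}{M_y}\sum_{i=1}^{M_y} P_3(\mathbf{z}|\mathbf{x}_{m,i})$. The
average error probability w.r.t. the ensemble of codes is upper bounded by:

$$\bar{P}_E^z \le \sum_{\mathbf{z}} E\left\{\left[\frac{1}{M_y}\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m,i})\right]^{1-\rho\lambda}\right\}\cdot E\left\{\left[\sum_{m'\ne m}\left(\frac{1}{M_y}\sum_{j=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m',j})\right)^{\lambda}\right]^{\rho}\right\}, \qquad \lambda \ge 0,\ \rho \ge 0. \tag{28}$$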
We will see that both expectations depend on $\mathbf{z}$ only through its empirical distribution. All the
analysis is done for a given $\mathbf{z}$. The summation over all possible empirical distributions of $\mathbf{z}$ is done
in the last step. $E_1(Q_Z, R_y, R_{yz}, \rho, \lambda)$ and $E_2(Q_Z, R_y, R_{yz}, \rho, \lambda)$ of Theorem 2 are the exponential
rates of the first and second expectations in (28), respectively. After this initial step, our analysis
is exponentially tight, whereas in the previous section, this is not necessarily the case. The price
for this tightness is that the derivation and the resulting expression are much more involved, as we
will see in the following subsections that derive $E_1(Q_Z, R_y, R_{yz}, \rho, \lambda)$ and $E_2(Q_Z, R_y, R_{yz}, \rho, \lambda)$.
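
The remark that the expectations depend on the received vector only through its empirical distribution rests on the method-of-types notation of this section. As a small illustration, the sketch below computes empirical distributions, empirical conditional distributions and empirical entropies of short sequences; the function names are hypothetical and entropies are in nats.

```python
import math
from collections import Counter

def empirical_distribution(x):
    """q_x(a) = n_x(a) / n for every letter a that occurs in x."""
    n = len(x)
    return {a: c / n for a, c in Counter(x).items()}

def empirical_conditional(x, y):
    """q_{x|y}(a|b) = q_{xy}(a,b) / q_y(b), listed for pairs that occur."""
    joint = Counter(zip(x, y))
    marginal = Counter(y)
    return {(a, b): c / marginal[b] for (a, b), c in joint.items()}

def empirical_entropy(x):
    """Entropy of the empirical distribution of x."""
    return -sum(q * math.log(q) for q in empirical_distribution(x).values())

def empirical_conditional_entropy(x, y):
    """Conditional entropy of X given Y under the joint empirical distribution."""
    n = len(x)
    joint = Counter(zip(x, y))
    marginal = Counter(y)
    return -sum((c / n) * math.log(c / marginal[b])
                for (a, b), c in joint.items())

# Two sequences in the same type class have the same empirical distribution.
x1 = [0, 0, 1, 1]
x2 = [1, 0, 1, 0]
same_type = empirical_distribution(x1) == empirical_distribution(x2)
```

Any quantity that is a function of the type alone, such as the likelihood of a sequence under a memoryless source, takes the same value on `x1` and `x2`.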

5.1 Deriving $E_1(Q_Z, R_y, R_{yz}, \rho, \lambda)$



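Let $N_{\mathbf{z},m}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}})$ be a type class enumerator, that is, the number of codewords within cloud $m$
having the same empirical conditional probability $Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}$.

$$
\begin{aligned}
E\left\{\left[\frac{1}{M_y}\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m,i})\right]^{1-\rho\lambda}\right\}
&= M_y^{\rho\lambda-1}E_{\mathbf{u}}E_{\mathbf{x}|\mathbf{u}}\left\{\left[\sum_{i=1}^{M_y}P_3(\mathbf{z}|\mathbf{X}_{m,i})\right]^{1-\rho\lambda}\right\} \\
&= M_y^{\rho\lambda-1}E_{\mathbf{u}}E_{\mathbf{x}|\mathbf{u}}\left\{\left[\sum_{Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}}N_{\mathbf{z},m}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}})\,e^{n\hat{E}_{\mathbf{zx}}\log P_3(Z|X)}\right]^{1-\rho\lambda}\right\} \\
&\doteq M_y^{\rho\lambda-1}E_{\mathbf{u}}\left\{\sum_{Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}}E_{\mathbf{x}|\mathbf{u}}\left\{N_{\mathbf{z},m}^{1-\rho\lambda}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}})\right\}e^{n(1-\rho\lambda)\hat{E}_{\mathbf{zx}}\log P_3(Z|X)}\right\}
\end{aligned} \tag{29}
$$

The last exponential equality is the first main point in our approach: It holds even before taking the
expectations, because the summation over $Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}$ consists of a sub-exponential number of terms.
Thus, the key issue here is how to assess the moments of the type class enumerator.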



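Note that the probability, under $P(\mathbf{x}|\mathbf{u}) = \prod_{i=1}^{n}P(x_i|u_i)$, of $\mathbf{x}$ to fall into $T_{\mathbf{x}|\mathbf{u},\mathbf{z}}$ is

$$|T_{\mathbf{x}|\mathbf{u},\mathbf{z}}|\cdot\prod_{a\in\mathcal{U},\,b\in\mathcal{X},\,c\in\mathcal{Z}}P(b|a)^{n\,q_{\mathbf{uxz}}(a,b,c)} \doteq e^{n\left(\hat{E}_{\mathbf{xu}}\log P(X|U) + \hat{H}(\mathbf{x}|\mathbf{z},\mathbf{u})\right)}$$

Given $\mathbf{u}$, we independently generate $e^{nR_y}$ codewords under $P(\mathbf{x}|\mathbf{u}) = \prod_{i=1}^{n}P(x_i|u_i)$. Therefore:

$$E_{\mathbf{x}|\mathbf{u}}N_{\mathbf{z},m}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}) \doteq e^{n\left(R_y + \hat{E}_{\mathbf{xu}}\log P(X|U) + \hat{H}(\mathbf{x}|\mathbf{z},\mathbf{u})\right)} \tag{30}$$

The second main point of our approach is that the moments of the type class enumerator behave
differently when the last exponent is positive or not (equivalently, $Q_{\mathbf{x}|\mathbf{z},\mathbf{u}} \in \mathcal{G}(R_y, Q_{\mathbf{u}|\mathbf{z}})$ or not).
By the same arguments as in [10, Appendix]

$$E_{\mathbf{x}|\mathbf{u}}N_{\mathbf{z},m}^{1-\rho\lambda}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}) \doteq
\begin{cases}
e^{n(1-\rho\lambda)\left(R_y + \hat{E}_{\mathbf{xu}}\log P(X|U) + \hat{H}(\mathbf{x}|\mathbf{z},\mathbf{u})\right)} & Q_{\mathbf{x}|\mathbf{z},\mathbf{u}} \in \mathcal{G}(R_y, Q_{\mathbf{u}|\mathbf{z}}) \\
e^{n\left(R_y + \hat{E}_{\mathbf{xu}}\log P(X|U) + \hat{H}(\mathbf{x}|\mathbf{z},\mathbf{u})\right)} & Q_{\mathbf{x}|\mathbf{z},\mathbf{u}} \in \mathcal{G}^c(R_y, Q_{\mathbf{u}|\mathbf{z}})
\end{cases} \tag{31}$$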

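We require $\rho\lambda < 1$ since the probability of $\{N_{\mathbf{z},m}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}) = 0\}$ is positive, and so, negative
moments of $N_{\mathbf{z},m}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}})$ diverge. The intuition behind this different behavior is that when
$Q_{\mathbf{x}|\mathbf{z},\mathbf{u}} \in \mathcal{G}(R_y, Q_{\mathbf{u}|\mathbf{z}})$, the enumerator concentrates extremely rapidly (double exponentially fast)
around its expectation. However, when $Q_{\mathbf{x}|\mathbf{z},\mathbf{u}} \in \mathcal{G}^c(R_y, Q_{\mathbf{u}|\mathbf{z}})$ the enumerator is typically zero,
and thus the dominant term when calculating the moment is $1^{1-\rho\lambda}\cdot\Pr\{N_{\mathbf{z},m}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}) = 1\}$.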
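
The two regimes of the enumerator moments can be seen numerically. Given the cloud center, the enumerator is a binomial random variable: each of the $M$ independently drawn codewords falls in the given conditional type class with some probability $p$. The sketch below, under that modeling assumption and with hypothetical names and toy values, computes the fractional moment $E[N^s]$ exactly and compares it with $(Mp)^s$ and with $Mp$.

```python
import math

def moment_of_enumerator(M, p, s):
    """Exact E[N**s] for N ~ Binomial(M, p), with 0 < p < 1 and s > 0."""
    total = 0.0
    for k in range(1, M + 1):
        # log of the binomial pmf, computed via lgamma for numerical stability
        log_pmf = (math.lgamma(M + 1) - math.lgamma(k + 1) - math.lgamma(M - k + 1)
                   + k * math.log(p) + (M - k) * math.log1p(-p))
        total += (k ** s) * math.exp(log_pmf)
    return total

M, s = 10000, 0.5          # s plays the role of 1 - rho * lambda

# Many codewords expected in the type class (M * p >> 1):
# N concentrates around its mean, so E[N^s] is close to (M p)^s.
p_large = 0.05
ratio_concentrated = moment_of_enumerator(M, p_large, s) / (M * p_large) ** s

# Few codewords expected (M * p << 1): N is typically zero and the moment
# is dominated by Pr{N = 1}, so E[N^s] is close to M p.
p_small = 1e-7
ratio_rare = moment_of_enumerator(M, p_small, s) / (M * p_small)
```

Both ratios come out close to one, which is the finite-size counterpart of the two cases in (31).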



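We continue from (29) by splitting the sum over all conditional types to those that belong to
$\mathcal{G}(R_y, Q_{\mathbf{u}|\mathbf{z}})$ and those that do not.

$$
\begin{aligned}
& M_y^{\rho\lambda-1}E_{\mathbf{u}}\left\{\sum_{Q_{\mathbf{x}|\mathbf{z},\mathbf{u}}}E_{\mathbf{x}|\mathbf{u}}\left\{N_{\mathbf{z},m}^{1-\rho\lambda}(Q_{\mathbf{x}|\mathbf{z},\mathbf{u}})\right\}e^{n(1-\rho\lambda)\hat{E}_{\mathbf{zx}}\log P_3(Z|X)}\right\} \\
&\doteq E_{\mathbf{u}}\Bigg\{\sum_{\mathcal{G}(R_y,Q_{\mathbf{u}|\mathbf{z}})}e^{n(1-\rho\lambda)\left(\hat{E}_{\mathbf{xu}}\log P(X|U) + \hat{H}(\mathbf{x}|\mathbf{z},\mathbf{u}) + \hat{E}_{\mathbf{zx}}\log P_3(Z|X)\right)} \\
&\qquad\quad + \sum_{\mathcal{G}^c(R_y,Q_{\mathbf{u}|\mathbf{z}})}e^{n\left(\rho\lambda R_y + \hat{E}_{\mathbf{xu}}\log P(X|U) + \hat{H}(\mathbf{x}|\mathbf{z},\mathbf{u}) + (1-\rho\lambda)\hat{E}_{\mathbf{zx}}\log P_3(Z|X)\right)}\Bigg\} \\
&\doteq E_{\mathbf{u}}\left(e^{n\alpha(Q_{\mathbf{u}|\mathbf{z}})} + e^{n\beta(Q_{\mathbf{u}|\mathbf{z}})}\right) \\
&\doteq \max_{Q_{\mathbf{u}|\mathbf{z}}}\Pr(Q_{\mathbf{u}|\mathbf{z}}|\mathbf{z})\left(e^{n\alpha(Q_{\mathbf{u}|\mathbf{z}})} + e^{n\beta(Q_{\mathbf{u}|\mathbf{z}})}\right)
\end{aligned} \tag{33}
$$

The last line is true since $\alpha(Q_{\mathbf{u}|\mathbf{z}})$ and $\beta(Q_{\mathbf{u}|\mathbf{z}})$ (cf. (6), (7)) depend on $\mathbf{u}$ through $Q_{\mathbf{u}|\mathbf{z}}$.
$\Pr(Q_{\mathbf{u}|\mathbf{z}}|\mathbf{z})$ is the probability, under $P(\mathbf{u}) = \prod_{i=1}^{n}P(u_i)$, to belong to $T_{\mathbf{u}|\mathbf{z}}$, which equals
(exponentially) $e^{n\left(\hat{E}_{\mathbf{u}}\log P(U) + \hat{H}(\mathbf{u}|\mathbf{z})\right)}$. If we had used Jensen's inequality, instead of the above
tight steps, the last sum would contain only $e^{n\alpha(Q_{\mathbf{u}|\mathbf{z}})}$ and the expression of $\alpha(Q_{\mathbf{u}|\mathbf{z}})$ would contain
a global maximization rather than the constrained optimization of (6). Therefore, Jensen's inequality
is tight whenever the unconstrained achiever of $\alpha(Q_{\mathbf{u}|\mathbf{z}})$ is in $\mathcal{G}(R_y, Q_{\mathbf{u}|\mathbf{z}})$ and $\alpha(Q_{\mathbf{u}|\mathbf{z}}) \ge \beta(Q_{\mathbf{u}|\mathbf{z}})$
(see [18, Appendix E] for more details).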

We start by evaluating a{Qu\z)' The unconstrained achiever of the optimization in (6) is P{x\z,u) 
and it belongs to Q{Ry,Qu\z) for large enough Ry if Ry — I{x;z\u) > (Here, unlike the single 
user case [10], such Ry can be in the capacity region). If P{x\z,u) G Q{Ry,Qu\z) The maximum 
in (6) will be obtained with the empirical distribution Q{x\u, z) = P{x\u, z) (as n oo). 
We now consider the case in which P{x\z,u) G Q'^{Ry,Qu\z)- Following the exact arguments of 
[10, Section 4.3], any internal point of G{Ry,Qu\z) can be improved by a point on the boundary 
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of Q{Ry,Qu\z) when P{x\z,u) G Q'^{Ry,Qu\z)- The achieving pmf will thus be 

$$
Q^*(x|u,z) = \frac{P(x|u)\,P_3^{\delta_R(u)}(z|x)}{\sum_{x'} P(x'|u)\,P_3^{\delta_R(u)}(z|x')} \tag{34}
$$

where $\delta_R(u)$ is such that $-R_y = E_{Q^*}\log P(x|u) + H_{Q^*}(x|z,u)$. The existence of $\delta_R(u)$ is discussed in Section A.2. Using the above arguments, since the constrained maximizer will be on the boundary of $\mathcal{G}(R_y, Q_{u|z})$, we can use the fact that on the boundary $-R_y = \hat{E}_Q\log P(x|u) + \hat{H}_Q(x|z,u)$ to get:

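$$
\begin{aligned}
\alpha(Q_{u|z}) &= (1-\rho\lambda)\Big(-R_y + \max_{\mathcal{G}(R_y, Q_{u|z})} \hat{E}_{ZX}\log P_3(Z|X)\Big) && \text{(35)} \\
&= (1-\rho\lambda)\big(-R_y + E_{Q^*}\log P_3(Z|X)\big) && \text{(36)}
\end{aligned}
$$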

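To summarize, when $P(x|z,u) \in \mathcal{G}(R_y, Q_{u|z})$ we have

$$
\alpha(Q_{u|z}) = (1-\rho\lambda)\Big[E_{P_{x|z,u}}\log P(X|U) + H_{P_{x|z,u}}(X|U,Z) + E_{P_{x|z,u}}\log P_3(Z|X)\Big] \tag{37}
$$

and when $P(x|z,u) \in \mathcal{G}^c(R_y, Q_{u|z})$ we have

$$
\alpha(Q_{u|z}) = (1-\rho\lambda)\big(-R_y + E_{Q^*}\log P_3(Z|X)\big). \tag{38}
$$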

We now proceed by evaluating $\beta(Q_{u|z})$.
The unconstrained achiever of (7) is

$$
Q_{1-\rho\lambda}(x|u,z) = \frac{P(x|u)\,P_3^{1-\rho\lambda}(z|x)}{\sum_{x'} P(x'|u)\,P_3^{1-\rho\lambda}(z|x')}.
$$

$R_y$ and $(1-\rho\lambda)$ determine whether $Q_{1-\rho\lambda}(x|u,z) \in \mathcal{G}^c(R_y, Q_{u|z})$. From the proof of the existence of $\delta(Q_{u|z})$ (Section A.2) it is easily seen that the unconstrained achiever is outside $\mathcal{G}^c(R_y, Q_{u|z})$ when $P(x|u,z) \in \mathcal{G}(R_y, Q_{u|z})$ or when $1-\rho\lambda < \delta(Q_{u|z})$. In this case, by the same arguments as before, the constrained achiever will be on the boundary and therefore:

$$
\beta(Q_{u|z}) = (1-\rho\lambda)\big[-R_y + E_{Q^*}\log P_3(Z|X)\big] \tag{39}
$$

where $Q^*(x|u,z)$ is defined in (34).



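In the case where $Q_{1-\rho\lambda}(x|u,z) \in \mathcal{G}^c(R_y, Q_{u|z})$ (i.e., $\rho\lambda < 1 - \delta(Q_{u|z})$), for simplicity, set $c(1-\rho\lambda, U, Z) = \sum_x P(x|U)\,P_3^{1-\rho\lambda}(Z|x)$. We have

$$
\begin{aligned}
\beta(Q_{u|z}) &= \rho\lambda R_y + E_{Q_{1-\rho\lambda}}\log\big[P(X|U)P_3^{1-\rho\lambda}(Z|X)\big] + H_{Q_{1-\rho\lambda}}(X|Z,U) \\
&= \rho\lambda R_y + E_{Q_{1-\rho\lambda}}\Big\{\log\big[P(X|U)P_3^{1-\rho\lambda}(Z|X)\big] - \log Q_{1-\rho\lambda}(X|U,Z)\Big\} \\
&= \rho\lambda R_y + E_{Q_{1-\rho\lambda}}\Big\{\log\big[P(X|U)P_3^{1-\rho\lambda}(Z|X)\big] - \log\frac{P(X|U)P_3^{1-\rho\lambda}(Z|X)}{c(1-\rho\lambda, U, Z)}\Big\} \\
&= \rho\lambda R_y + \hat{E}_{UZ}\log c(1-\rho\lambda, U, Z)
\end{aligned}
\tag{40}
$$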
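The last step of (40) rests on the fact that, for a fixed pair $(u, z)$, the tilted pmf attains $E_Q \log[P(x|u)P_3^{s}(z|x)] + H(Q) = \log c(s,u,z)$, and that no other pmf does better. The following sketch checks this numerically for one pair $(u,z)$. It is an illustration under assumed inputs: `prior` stands for $P(\cdot|u)$, `likelihood` for $P_3(z|\cdot)$, and `s` for $1-\rho\lambda$; none of these names come from the paper.

```python
import math

def tilted_pmf(prior, likelihood, s):
    """Return (Q_s, c) with Q_s(x) = prior[x] * likelihood[x]**s / c."""
    weights = [p * l ** s for p, l in zip(prior, likelihood)]
    c = sum(weights)
    return [w / c for w in weights], c

def objective(q, prior, likelihood, s):
    """E_Q log[prior * likelihood**s] + H(Q) for a pmf q on the same alphabet."""
    return sum(qi * (math.log(p) + s * math.log(l) - math.log(qi))
               for qi, p, l in zip(q, prior, likelihood) if qi > 0)
```

Evaluating `objective` at the tilted pmf returns `log(c)`, while any other pmf gives a smaller value, which mirrors the unconstrained maximization in (7).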

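To summarize:

$$
\beta(Q_{u|z}) =
\begin{cases}
(1-\rho\lambda)\big(-R_y + E_{Q^*}\log P_3(Z|X)\big) & P(x|z,u) \in \mathcal{G}(R_y, Q_{u|z}) \text{ or } \rho\lambda > 1 - \delta(Q_{u|z}) \\
\rho\lambda R_y + \hat{E}_{uz}\log c(1-\rho\lambda, u, z) & \rho\lambda < 1 - \delta(Q_{u|z})
\end{cases}
\tag{41}
$$

Finally, letting $E_{\alpha\beta}(Q_{u|z}) = \max\{\alpha(Q_{u|z}), \beta(Q_{u|z})\}$, substituting it into (33) and letting $n \to \infty$ yields $E_1(Q_z, R_y, R_{yz}, \rho, \lambda)$.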
5.2 Deriving $E_2(Q_z, R_y, R_{yz}, \rho, \lambda)$



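We now proceed to the second expectation of the original bound.

$$
\begin{aligned}
E\Bigg[\bigg(\sum_{m'\neq m}\Big(\frac{1}{M_y}\sum_{j=1}^{M_y} P_3(z|X_{m',j})\Big)^{\lambda}\bigg)^{\rho}\Bigg]
&= M_y^{-\rho\lambda}\, E\Bigg[\bigg(\sum_{m'\neq m}\Big(\sum_{Q_{x|z}} N_{z,m'}(Q_{x|z})\, e^{n\hat{E}_{ZX}\log P_3(Z|X)}\Big)^{\lambda}\bigg)^{\rho}\Bigg] \\
&\doteq M_y^{-\rho\lambda} \sum_{Q_{x|z}} e^{n\rho\lambda\hat{E}_{ZX}\log P_3(Z|X)}\, E\Bigg[\bigg(\sum_{m'\neq m} N^{\lambda}_{z,m'}(Q_{x|z})\bigg)^{\rho}\Bigg]
\end{aligned}
\tag{42}
$$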



Here, unlike the previous subsection, there are two main obstacles. The first is the inner sum over $m' \neq m$, which has an exponential number of terms. In the previous subsection, when we used the enumerators, the resulting sums had only a polynomial number of terms, which allowed us to distribute the expectation operator and moments over the summands without losing exponential tightness. Here we have to use a different approach. The second obstacle is that the enumerators $N_{z,m'}(Q_{x|z})$ are distributed differently for every $m'$ (since the codewords are drawn given $u_{m'}$). Note, however, that for all $u_{m'}$ that belong to the same conditional type $T_{u|z}$ the corresponding enumerators are identically distributed. We use this fact in the following.



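We continue by dividing $[0, R_{yz}]$ into a grid with a sub-exponential number of intervals in $n$ (for example, $d = \ldots$). Evaluating the last expectation in (42), we have

$$
\begin{aligned}
E\Bigg[\bigg(\sum_{m'\neq m} N^{\lambda}_{z,m'}(Q_{x|z})\bigg)^{\rho}\Bigg]
&= E\Bigg[\bigg(\sum_{A=0}^{R_{yz}} \big(\text{number of times } N_{z,m'}(Q_{x|z}) = e^{nA}\big)\, e^{n\lambda A}\bigg)^{\rho}\Bigg] \\
&\doteq \sum_{A=0}^{R_{yz}} e^{n\lambda\rho A}\, E\Big[\big(\text{number of times } N_{z,m'}(Q_{x|z}) = e^{nA}\big)^{\rho}\Big] \\
&= \sum_{A=0}^{R_{yz}} e^{n\lambda\rho A}\, E\Bigg[\bigg(\sum_{m'\neq m} I_{m'}(A)\bigg)^{\rho}\Bigg]
\end{aligned}
\tag{43}
$$

where $I_{m'}(A) = \mathbb{1}\{N_{z,m'}(Q_{x|z}) = e^{nA}\}$ (omitting the dependence on $Q_{x|z}$ to simplify notation).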

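Next, we partition the summation over $m'$ into subsets in which the enumerators are identically distributed, as described above.

$$
E\Bigg[\bigg(\sum_{m'\neq m} I_{m'}(A)\bigg)^{\rho}\Bigg]
= E\Bigg[\bigg(\sum_{Q_{u|z}}\ \sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A)\bigg)^{\rho}\Bigg]
\doteq \sum_{Q_{u|z}} E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A)\bigg)^{\rho}\Bigg]
\tag{44}
$$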



Note that the number of terms in the inner summation of (44) is a random variable. Define $M_{Q_{u|z}} \triangleq |\{m' : u_{m'} \in T_{u|z}\}|$, the number of cloud centers that belong to the same conditional type. Since we draw $e^{nR_{yz}}$ cloud centers independently with $P(u^n) = \prod_{i=1}^{n} P(u_i)$, we have:

$$
E\big[M_{Q_{u|z}}\big] \doteq e^{n(R_{yz} + \hat{H}(U|Z) + \hat{E}_u\log P(U))} \triangleq e^{n\,m(Q_{u|z})}
$$



The sign of the last exponent determines whether we are likely to find an exponential number of cloud centers of this type. We show in Section A.3 that when $m(Q_{u|z}) > 0$ (i.e., $Q_{u|z} \in \mathcal{G}_z(R_{yz})$), $M_{Q_{u|z}}$ converges to its expectation double exponentially fast. When $m(Q_{u|z}) \le 0$, $\Pr\{M_{Q_{u|z}} > e^{n\epsilon}\}$ vanishes double exponentially fast.

Let $P_A(Q_{x|z}, Q_{u|z}) = \Pr\{I_{m'}(A) = 1\}$ denote the probability that we have $e^{nA}$ codewords around cloud $m'$ that belong to $T_{x|z}$. Define

$$
A^*(Q_{x|z}, Q_{u|z}) = \big[N(Q_{x|z}, Q_{u|z}, R_y)\big]_+
$$

We show in Section A.4 that when $A = A^*(Q_{x|z}, Q_{u|z}) > 0$, $P_{A^*}(Q_{x|z}, Q_{u|z})$ converges to 1, while $P_A(Q_{x|z}, Q_{u|z})$ vanishes for every other $A$, both double exponentially fast. When $A^*(Q_{x|z}, Q_{u|z}) = 0$, we show that $P_{A=0}(Q_{x|z}, Q_{u|z}) \doteq e^{nN(Q_{x|z}, Q_{u|z}, R_y)}$. Thus, the outer summation in (43) consists only of those $A^*(Q_{x|z}, Q_{u|z})$; the number of elements in the summation is upper bounded by the number of conditional types $Q_{x|z}$ times the number of conditional types $Q_{u|z}$, which is sub-exponential in $n$.
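The claim that the number of types is sub-exponential in $n$ follows from the standard counting bound for empirical distributions. As a small illustration (not taken from the paper; the function name is an assumption), the sketch below counts the empirical distributions with denominator $n$ on a finite alphabet and compares the count with the polynomial bound $(n+1)^{k}$, where $k$ is the alphabet size. For conditional types the same bound applies once per conditioning symbol.

```python
from math import comb

def num_types(n, alphabet_size):
    """Number of empirical distributions with denominator n on the given alphabet."""
    # Stars-and-bars count of non-negative integer vectors summing to n.
    return comb(n + alphabet_size - 1, alphabet_size - 1)
```

Since the count grows only polynomially in $n$, a sum over types can be replaced by its largest term without changing the exponent.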

Continuing (44), there are four cases: the combinations of $Q_{u|z} \in \mathcal{G}_z(R_{yz})$ or not, and $A^*(Q_{x|z}, Q_{u|z}) > 0$ or $A^*(Q_{x|z}, Q_{u|z}) = 0$. We start with the case $A^*(Q_{x|z}, Q_{u|z}) > 0$.



5.2.1 The case $A = A^*(Q_{x|z}, Q_{u|z}) > 0$



We need to evaluate:

$$
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A)\bigg)^{\rho}\Bigg] \tag{45}
$$

We use the fact that for $A = A^*(Q_{x|z}, Q_{u|z})$, $P_A(Q_{x|z}, Q_{u|z}) > 1 - \epsilon$, for some $\epsilon > 0$ that vanishes double exponentially fast (see Section A.4), to show that the event in which all the indicators $I_{m'}(A)$ equal one is very likely. Denote this event by $\mathcal{A}$:

$$
\Pr(\mathcal{A}) \ge (1-\epsilon)^{M_{Q_{u|z}}} = e^{M_{Q_{u|z}}\log(1-\epsilon)} \ge e^{-M_{Q_{u|z}}\frac{\epsilon}{1-\epsilon}} \tag{46}
$$

$M_{Q_{u|z}}$ is a random variable in $[0, e^{nR_{yz}}]$. Since $\epsilon$ vanishes double exponentially fast, we have $\Pr(\mathcal{A}) \to 1$ double exponentially fast.



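$$
\begin{aligned}
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A)\bigg)^{\rho}\Bigg]
&= \Pr(\mathcal{A})\, E\Bigg[\bigg(\sum_{m'} I_{m'}(A)\bigg)^{\rho}\,\Bigg|\,\mathcal{A}\Bigg]
+ \Pr(\mathcal{A}^c)\, E\Bigg[\bigg(\sum_{m'} I_{m'}(A)\bigg)^{\rho}\,\Bigg|\,\mathcal{A}^c\Bigg] \\
&\doteq E\Bigg[\bigg(\sum_{m'} I_{m'}(A)\bigg)^{\rho}\,\Bigg|\,\mathcal{A}\Bigg] \\
&= E\Big[M_{Q_{u|z}}^{\rho}\,\Big|\,\mathcal{A}\Big]
\end{aligned}
\tag{47}
$$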



In the second to last line we used the fact that $\Pr(\mathcal{A}^c)$ vanishes fast enough to make the second term in the summation negligible (note that the expectation can grow, at most, at an exponential rate while $\Pr(\mathcal{A}^c)$ vanishes double exponentially fast). In the last step we used the fact that given $\mathcal{A}$, all the indicators are equal to one. Note that the conditioning on the event $\mathcal{A}$ introduces dependencies between the drawings of the codewords $x$ and the clouds $u$ (given $\mathcal{A}$, for instance, there might be some $u \in \mathcal{U}^n$ which cannot be drawn, therefore the clouds are no longer drawn according to $\prod_{i=1}^{n} P(u_i)$). We claim that since the conditioning in (47) is on an event which is very likely (its probability is very close to 1), we can remove the conditioning without changing the resulting value by much. To see this, let $M_{Q_{u|z}}$ be distributed according to some probability measure $Q$.

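$$
Q(M_{Q_{u|z}}) = \Pr(\mathcal{A})\,Q(M_{Q_{u|z}}|\mathcal{A}) + \Pr(\mathcal{A}^c)\,Q(M_{Q_{u|z}}|\mathcal{A}^c) \ge (1-\epsilon)\,Q(M_{Q_{u|z}}|\mathcal{A}) \tag{48}
$$

On the other hand,

$$
Q(M_{Q_{u|z}}) = \Pr(\mathcal{A})\,Q(M_{Q_{u|z}}|\mathcal{A}) + \Pr(\mathcal{A}^c)\,Q(M_{Q_{u|z}}|\mathcal{A}^c) \le Q(M_{Q_{u|z}}|\mathcal{A}) + \epsilon\cdot 1. \tag{49}
$$

Therefore,

$$
Q(M_{Q_{u|z}}) - \epsilon \le Q(M_{Q_{u|z}}|\mathcal{A}) \le \frac{Q(M_{Q_{u|z}})}{1-\epsilon} \tag{50}
$$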



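Since $\epsilon \to 0$ double exponentially fast, we can replace $Q(M_{Q_{u|z}}|\mathcal{A})$ by $Q(M_{Q_{u|z}})$ in the calculation of the expectation in (47) and preserve exponential tightness. Using Section A.3, for $Q_{u|z} \in \mathcal{G}_z(R_{yz})$ we have:

$$
E\big[M_{Q_{u|z}}^{\rho}\big] \le e^{n\rho[m(Q_{u|z})+\epsilon]}\,\Pr\big\{M_{Q_{u|z}} \le e^{n(m(Q_{u|z})+\epsilon)}\big\} + e^{n\rho R_{yz}}\,\Pr\big\{M_{Q_{u|z}} \ge e^{n(m(Q_{u|z})+\epsilon)}\big\} \tag{51}
$$

On the other hand:

$$
\begin{aligned}
E\big[M_{Q_{u|z}}^{\rho}\big] &\ge e^{n\rho[m(Q_{u|z})-\epsilon]}\,\Pr\big\{M_{Q_{u|z}} \ge e^{n(m(Q_{u|z})-\epsilon)}\big\} \\
&= e^{n\rho[m(Q_{u|z})-\epsilon]}\Big(1 - \Pr\big\{M_{Q_{u|z}} < e^{n(m(Q_{u|z})-\epsilon)}\big\}\Big)
\end{aligned}
\tag{52}
$$

Finally, we have for $m(Q_{u|z}) > 0$:

$$
E\big[M_{Q_{u|z}}^{\rho}\big] \doteq e^{n\rho\,m(Q_{u|z})} \tag{53}
$$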



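When $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$ we have:

$$
E\big[M_{Q_{u|z}}^{\rho}\big] \le e^{n\rho\epsilon}\,\Pr\big\{1 \le M_{Q_{u|z}} \le e^{n\epsilon}\big\} + e^{n\rho R_{yz}}\,\Pr\big\{M_{Q_{u|z}} > e^{n\epsilon}\big\} \tag{54}
$$

The second term vanishes since the probability that $M_{Q_{u|z}} > e^{n\epsilon}$ vanishes double exponentially fast for $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$. Neglecting the second term and using the properties of $M_{Q_{u|z}}$ proved in Section A.3, we continue:

$$
E\big[M_{Q_{u|z}}^{\rho}\big] \le e^{n\rho\epsilon}\,\Pr\big\{M_{Q_{u|z}} \ge 1\big\} \doteq e^{n(\rho\epsilon + m(Q_{u|z}))} \tag{55}
$$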



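On the other hand:

$$
E\big[M_{Q_{u|z}}^{\rho}\big] \ge 1\cdot\Pr\big\{M_{Q_{u|z}} = 1\big\} \doteq e^{n\,m(Q_{u|z})} \tag{56}
$$

Therefore, since we can let $\epsilon$ vanish sufficiently slowly with $n$, e.g. $\epsilon = 1/\sqrt{n}$, we have for $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$:

$$
E\big[M_{Q_{u|z}}^{\rho}\big] \doteq e^{n\,m(Q_{u|z})} \tag{57}
$$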



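To conclude this subsection, when $A^*(Q_{x|z}, Q_{u|z}) > 0$:

$$
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A^*)\bigg)^{\rho}\Bigg] \doteq
\begin{cases}
e^{n\rho\,m(Q_{u|z})} & Q_{u|z} \in \mathcal{G}_z(R_{yz}) \\
e^{n\,m(Q_{u|z})} & Q_{u|z} \in \mathcal{G}_z^c(R_{yz})
\end{cases}
\tag{58}
$$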
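The two regimes in (58) reflect a general property of the $\rho$-th moment of a binomial count: when the mean is large the count concentrates and $E[M^{\rho}] \approx (E M)^{\rho}$, whereas when the mean is small the moment is dominated by the event $M = 1$ and $E[M^{\rho}] \approx E M$. The sketch below illustrates this with an exact term-by-term sum; it is only a numerical illustration, with the function name and the parameter values chosen as assumptions rather than taken from the paper.

```python
import math

def binomial_moment(num_trials, p, rho):
    """Exact E[M**rho] for M ~ Binomial(num_trials, p), summed term by term."""
    log_p, log_q = math.log(p), math.log1p(-p)
    log_n_fact = math.lgamma(num_trials + 1)
    total = 0.0
    for k in range(1, num_trials + 1):  # the k = 0 term contributes nothing for rho > 0
        log_pmf = (log_n_fact - math.lgamma(k + 1) - math.lgamma(num_trials - k + 1)
                   + k * log_p + (num_trials - k) * log_q)
        total += k ** rho * math.exp(log_pmf)
    return total
```

With 10000 trials, a success probability of 0.1 gives a moment close to the mean raised to the power $\rho$, while a success probability of $10^{-6}$ gives a moment close to the mean itself.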



5.2.2 The case $A^*(Q_{x|z}, Q_{u|z}) = 0$



Here, as before, we divide into two cases: $Q_{u|z} \in \mathcal{G}_z(R_{yz})$ or $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$. Unlike the previous case, where we knew that $P_{A^*}(Q_{x|z}, Q_{u|z})$ converges to 1 double exponentially fast, here we know that $P_0(Q_{x|z}, Q_{u|z}) \doteq e^{nN(Q_{x|z}, Q_{u|z}, R_y)}$ (with $N(Q_{x|z}, Q_{u|z}, R_y) < 0$, see Section A.4). Therefore, we have to use a somewhat different approach. We start with the case of $Q_{u|z} \in \mathcal{G}_z(R_{yz})$.



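$$
\begin{aligned}
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg]
&\le e^{n\rho[m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) + \epsilon]}\,\Pr\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0) \le e^{n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) + \epsilon)}\Bigg\} \\
&\quad + e^{n\rho R_{yz}}\,\Pr\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0) \ge e^{n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) + \epsilon)}\Bigg\}
\end{aligned}
\tag{59}
$$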
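Focusing on the probability in the second term:

$$
\begin{aligned}
\Pr\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0) \ge e^{n(m(Q_{u|z}) + N + \epsilon)}\Bigg\}
&= \sum_{m=0}^{e^{nR_{yz}}} \Pr\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0) \ge e^{n(m(Q_{u|z}) + N + \epsilon)}\,\Bigg|\,M_{Q_{u|z}} = m\Bigg\}\Pr\big(M_{Q_{u|z}} = m\big) \\
&\doteq \Pr\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0) \ge e^{n(m(Q_{u|z}) + N + \epsilon)}\,\Bigg|\,M_{Q_{u|z}} = e^{n\,m(Q_{u|z})}\Bigg\}
\end{aligned}
\tag{60}
$$

where $N$ abbreviates $N(Q_{x|z}, Q_{u|z}, R_y)$. The last step is true because of the concentration of $M_{Q_{u|z}}$ around its expectation when $Q_{u|z} \in \mathcal{G}_z(R_{yz})$: $\Pr\{M_{Q_{u|z}} \doteq e^{n\,m(Q_{u|z})}\} \to 1$ double exponentially fast (see Section A.3). Here, as in the previous subsection, we condition on an event which is extremely likely. By the same arguments we used in the previous subsection, we remove the conditioning. Continuing (60) we have:

$$
\Pr\Bigg\{\sum_{m'=1}^{e^{n\,m(Q_{u|z})}} I_{m'}(0) \ge e^{n(m(Q_{u|z}) + N + \epsilon)}\Bigg\} \tag{61}
$$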



We are left with analyzing the probability that we have more than $e^{n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) + \epsilon)}$ successes in $e^{n\,m(Q_{u|z})}$ independent Bernoulli trials with success probability $e^{nN(Q_{x|z}, Q_{u|z}, R_y)}$ each. By using the Chernoff bound, it is easily seen that the probability of this event vanishes double exponentially fast, since we have an exponential number of trials. We therefore have:

$$
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg] \le e^{\rho[n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) + \epsilon)]} \tag{62}
$$
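The Chernoff step used before (62) can be illustrated numerically. The sketch below, which is an illustration only with assumed function names and example parameters, compares the exact upper tail of a binomial count with the bound $\exp\{-K\,D(a\|p)\}$, where $K$ is the number of trials and $a$ is the threshold divided by $K$. Because the exponent is linear in $K$, an exponential number of trials turns a fixed divergence gap into double-exponential decay.

```python
import math

def kl_bernoulli(a, b):
    """Binary divergence D(a||b) in nats, for 0 < a < 1 and 0 < b < 1."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

def chernoff_upper_tail(num_trials, p, threshold):
    """Chernoff bound on Pr{Bin(num_trials, p) >= threshold}, for p*num_trials < threshold < num_trials."""
    a = threshold / num_trials
    return math.exp(-num_trials * kl_bernoulli(a, p))

def exact_upper_tail(num_trials, p, threshold):
    """Exact Pr{Bin(num_trials, p) >= threshold}, summed term by term."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(threshold, num_trials + 1):
        log_pmf = (math.lgamma(num_trials + 1) - math.lgamma(k + 1)
                   - math.lgamma(num_trials - k + 1) + k * log_p + (num_trials - k) * log_q)
        total += math.exp(log_pmf)
    return total
```

Multiplying the number of trials by four while keeping the ratio of threshold to trials fixed raises the bound to the fourth power, which is the mechanism behind the double-exponential rate mentioned in the text.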
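The lower bound for $Q_{u|z} \in \mathcal{G}_z(R_{yz})$ is given by

$$
\begin{aligned}
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg]
&\ge e^{\rho[n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) - \epsilon)]}\,\Pr\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0) \ge e^{n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) - \epsilon)}\Bigg\} \\
&= e^{\rho[n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) - \epsilon)]} \times \Bigg(1 - \Pr\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0) < e^{n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y) - \epsilon)}\Bigg\}\Bigg)
\end{aligned}
\tag{63}
$$

By the same arguments we used in the upper bound, the last probability vanishes double exponentially fast. So we have for $Q_{u|z} \in \mathcal{G}_z(R_{yz})$:

$$
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg] \doteq e^{n\rho[m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y)]} \tag{64}
$$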



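We now continue to the case $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$. Here, we know that $M_{Q_{u|z}}$ is sub-exponential (the probability that $M_{Q_{u|z}}$ is sub-exponential converges to 1 double exponentially fast). Therefore, we will not be able to apply the Chernoff bound as we did before in (61). Again, we use a different approach.

$$
\begin{aligned}
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg]
&= \Pr\big\{M_{Q_{u|z}} \le e^{n\epsilon}\big\}\, E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\,\Bigg|\,M_{Q_{u|z}} \le e^{n\epsilon}\Bigg] \\
&\quad + \Pr\big\{M_{Q_{u|z}} > e^{n\epsilon}\big\}\, E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\,\Bigg|\,M_{Q_{u|z}} > e^{n\epsilon}\Bigg]
\end{aligned}
\tag{65}
$$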



The second term can be neglected since $\Pr\{M_{Q_{u|z}} > e^{n\epsilon}\}$ vanishes double exponentially fast for $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$ while the expectation grows at most at an exponential rate. Since we know that the number of elements in the sum over $m'$ is of sub-exponential order, we can distribute $\rho$ over the summands and still preserve exponential tightness.

$$
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg] \doteq E\Bigg[\sum_{m':\,u_{m'}\in T_{u|z}} I^{\rho}_{m'}(0)\,\Bigg|\,M_{Q_{u|z}} \le e^{n\epsilon}\Bigg] \tag{66}
$$



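We now condition on $M_{Q_{u|z}}$. Doing this alone would introduce dependencies between the $u$'s and $x$ and change the probability law of the indicator function. To avoid this, we condition also on $u_{m'}$. Given a specific $u_{m'}$, all drawings of $x_{m',i}$ are independent and $P_{A=0}(Q_{x|z}, Q_{u|z})$ remains intact.

$$
E\Bigg[\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\,\Bigg|\,M_{Q_{u|z}} \le e^{n\epsilon}\Bigg]
= E_{M_{Q_{u|z}}}\Bigg\{\sum_{m':\,u_{m'}\in T_{u|z}} E_{u_{m'}}\Big[E\big(I_{m'}(0)\,\big|\,u_{m'}, M_{Q_{u|z}}\big)\Big]\Bigg\}
\tag{67}
$$

Given $u$, the inner expectation is independent of the number of such $u$'s ($M_{Q_{u|z}}$) and becomes $P_{A=0}(Q_{x|z}, Q_{u|z})$. Now, since $P_{A=0}(Q_{x|z}, Q_{u|z})$ is constant for all $u$'s in the conditional type $T_{u|z}$, the expectation over $u$ does not change the value and we are left with:

$$
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg] \doteq e^{n(m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y))} \tag{68}
$$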

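To summarize this subsection: when $A^*(Q_{x|z}, Q_{u|z}) = 0$ we have

$$
E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(0)\bigg)^{\rho}\Bigg] \doteq
\begin{cases}
e^{n\rho[m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y)]} & Q_{u|z} \in \mathcal{G}_z(R_{yz}) \\
e^{n[m(Q_{u|z}) + N(Q_{x|z}, Q_{u|z}, R_y)]} & Q_{u|z} \in \mathcal{G}_z^c(R_{yz})
\end{cases}
\tag{69}
$$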



5.2.3 Wrapping up 



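Using the results we obtained in the previous two subsections, we are now ready to continue (43).

$$
\begin{aligned}
E\Bigg[\bigg(\sum_{m'\neq m} N^{\lambda}_{z,m'}(Q_{x|z})\bigg)^{\rho}\Bigg]
&\doteq \sum_{A\ge 0} e^{n\lambda\rho A}\, E\Bigg[\bigg(\sum_{Q_{u|z}}\ \sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A)\bigg)^{\rho}\Bigg] \\
&\doteq \sum_{A\ge 0} e^{n\lambda\rho A} \sum_{Q_{u|z}} E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A)\bigg)^{\rho}\Bigg]
\end{aligned}
\tag{70}
$$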
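We saw that for all $A \neq A^*(Q_{x|z}, Q_{u|z})$ the inner sum vanishes. Using definitions (12) and (13) we continue:

$$
\begin{aligned}
&\doteq \sum_{Q_{u|z}} e^{n\lambda\rho A^*(Q_{x|z}, Q_{u|z})}\, E\Bigg[\bigg(\sum_{m':\,u_{m'}\in T_{u|z}} I_{m'}(A^*)\bigg)^{\rho}\Bigg] \\
&= \sum_{Q_{u|z}\in\mathcal{G}_z(R_{yz})} e^{n(B(Q_{x|z}, Q_{u|z}, R_y) + \rho\,m(Q_{u|z}))} + \sum_{Q_{u|z}\in\mathcal{G}_z^c(R_{yz})} e^{n(C(Q_{x|z}, Q_{u|z}, R_y) + m(Q_{u|z}))} \\
&\doteq \exp\Big\{n\max\Big[\max_{Q_{u|z}\in\mathcal{G}_z(R_{yz})}\big(B(Q_{x|z}, Q_{u|z}, R_y) + \rho\,m(Q_{u|z})\big),\ \max_{Q_{u|z}\in\mathcal{G}_z^c(R_{yz})}\big(C(Q_{x|z}, Q_{u|z}, R_y) + m(Q_{u|z})\big)\Big]\Big\} \\
&\triangleq e^{nE(Q_{x|z})}
\end{aligned}
\tag{71}
$$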



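Substituting this into (42), we have:

$$
E\Bigg[\bigg(\sum_{m'\neq m}\Big(\frac{1}{M_y}\sum_{j=1}^{M_y} P_3(z|X_{m',j})\Big)^{\lambda}\bigg)^{\rho}\Bigg] \doteq \exp\Big\{n\max_{Q_{x|z}}\big[\rho\lambda\hat{E}_{ZX}\log P_3(z|x) + E(Q_{x|z}) - \rho\lambda R_y\big]\Big\} \tag{72}
$$

When $n \to \infty$, this is the expression of $E_2(Q_z, R_y, R_{yz}, \rho, \lambda)$ of Theorem 2.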



5.3 The Strong Decoder 



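We now proceed to the derivation of the strong decoder exponent. We start with the same steps as in the Gallager-type approach (22):

$$
\begin{aligned}
&E\sum_{y} P_1(y|X_{m,i})\Bigg(\frac{\sum_{i'\neq i} P_1^{\lambda}(y|X_{m,i'}) + \sum_{m'\neq m}\sum_{i'=1}^{M_y} P_1^{\lambda}(y|X_{m',i'})}{P_1^{\lambda}(y|X_{m,i})}\Bigg)^{\rho} \\
&\quad= E\sum_{y} P_1^{1-\lambda\rho}(y|X_{m,i})\Bigg(\sum_{i'\neq i} P_1^{\lambda}(y|X_{m,i'}) + \sum_{m'\neq m}\sum_{i'=1}^{M_y} P_1^{\lambda}(y|X_{m',i'})\Bigg)^{\rho} \\
&\quad\le E P_{E_{y1}} + E P_{E_{y2}}
\end{aligned}
\tag{73}
$$

where $P_{E_{y1}}$ collects the competing codewords within the same cloud ($i' \neq i$) and $P_{E_{y2}}$ collects the codewords of the other clouds ($m' \neq m$).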
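As before, we evaluate the expressions for a given $y$ and sum over all $y$ in the last step. We start with $P_{E_{y1}}$:

$$
E P_{E_{y1}} = \sum_{y} E\Big[P_1^{1-\lambda\rho}(y|X_{m,i})\Big]\times E\Bigg[\bigg(\sum_{i'\neq i} P_1^{\lambda}(y|X_{m,i'})\bigg)^{\rho}\Bigg] \tag{74}
$$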



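The first expectation becomes:

$$
\begin{aligned}
E\,P_1(y|X_{m,i})^{1-\lambda\rho} &= E_u E_{x|u}\,P_1(y|X_{m,i})^{1-\lambda\rho} \\
&\doteq E_u\Big\{\max_{Q_{x|uy}} \Pr(Q_{x|uy})\, e^{n(1-\rho\lambda)\hat{E}_{yx}\log P_1(y|x)}\Big\} \\
&\doteq \max_{Q_{u|y}} \Pr(Q_{u|y}) \max_{Q_{x|uy}} \Pr(Q_{x|uy})\, e^{n(1-\rho\lambda)\hat{E}_{yx}\log P_1(y|x)} \\
&= \max_{Q_{u|y}} e^{n(\hat{E}_u\log P(U) + \hat{H}(u|y))} \max_{Q_{x|uy}} e^{n(\hat{E}_{ux}\log P(X|U) + \hat{H}(x|u,y) + (1-\rho\lambda)\hat{E}_{yx}\log P_1(y|X))} \\
&= \max_{Q_{u|y}}\max_{Q_{x|uy}} e^{n(\hat{E}_{ux}\log P(U,X) + \hat{H}(x,u|y) + (1-\rho\lambda)\hat{E}_{yx}\log P_1(y|X))} \\
&= \max_{Q_{x,u|y}} e^{n(\hat{E}_{ux}\log P(U,X) + \hat{H}(x,u|y) + (1-\rho\lambda)\hat{E}_{yx}\log P_1(Y|X))}
\end{aligned}
\tag{75}
$$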

The last exponent is $E_3(Q_y, R_y, R_{yz}, \rho, \lambda)$ of Theorem 2 as $n \to \infty$. The derivation of the exponent of the second expectation is quite similar to the steps following (29) in the weak decoder exponent. We therefore only outline the derivation here. For the second expectation we have:

$$
E\Bigg\{\bigg(\sum_{i'\neq i} P_1(y|X_{m,i'})^{\lambda}\bigg)^{\rho}\Bigg\} = E_u E_{x|u}\Bigg\{\bigg(\sum_{Q_{x|uy}} N_{y,m}(Q_{x|uy})\, e^{n\lambda\hat{E}_{yx}\log P_1(y|x)}\bigg)^{\rho}\Bigg\} \tag{76}
$$
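As in the case of the weak decoder we define:

$$
\mathcal{G}(R_y, Q_{u|y}) = \big\{Q_{x|u,y} : R_y + E_Q\log P(X|U) + H_Q(X|U,Y) > 0\big\} \tag{77}
$$

and we have

$$
E_{x|u}\,N^{\rho}_{y,m}(Q_{x|u,y}) \doteq
\begin{cases}
e^{n\rho(R_y + E_Q\log P(X|U) + H_Q(X|U,Y))} & Q_{x|u,y} \in \mathcal{G}(R_y, Q_{u|y}) \\
e^{n(R_y + E_Q\log P(X|U) + H_Q(X|U,Y))} & Q_{x|u,y} \in \mathcal{G}^c(R_y, Q_{u|y})
\end{cases}
\tag{78}
$$

Now define:

$$
\gamma(Q_{u|y}) = \rho\Big[R_y + \max_{Q_{x|u,y}\in\mathcal{G}(R_y, Q_{u|y})}\big(E_Q\log P(X|U) + H_Q(X|Y,U) + \lambda E_Q\log P_1(Y|X)\big)\Big] \tag{79}
$$

where, as described in Section 2, $P_1(\cdot|\cdot)$ is the channel to the strong user. Similarly, define:

$$
\zeta(Q_{u|y}) = R_y + \max_{Q_{x|u,y}\in\mathcal{G}^c(R_y, Q_{u|y})}\big[E_Q\log P(X|U) + H_Q(X|U,Y) + \rho\lambda E_Q\log P_1(Y|X)\big] \tag{80}
$$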

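We now continue (76) by splitting the sum over all $Q_{x|uy}$ into $Q_{x|uy} \in \mathcal{G}(R_y, Q_{u|y})$ and $Q_{x|uy} \in \mathcal{G}^c(R_y, Q_{u|y})$.

$$
\begin{aligned}
E\Bigg\{\bigg(\sum_{i'\neq i} P_1(y|X_{m,i'})^{\lambda}\bigg)^{\rho}\Bigg\} &\doteq E_u\Big[e^{n\gamma(Q_{u|y})} + e^{n\zeta(Q_{u|y})}\Big] \\
&\doteq \max_{Q_{u|y}} \Pr(Q_{u|y})\Big[e^{n\gamma(Q_{u|y})} + e^{n\zeta(Q_{u|y})}\Big]
\end{aligned}
\tag{81}
$$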

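We begin with the evaluation of $\gamma(Q_{u|y})$. The unconstrained achiever in (79) is:

$$
Q_{\lambda}(x|u,y) = \frac{P(x|u)\,P_1^{\lambda}(y|x)}{\sum_{x'} P(x'|u)\,P_1^{\lambda}(y|x')}
$$

If $Q_{\lambda}(x|u,y) \in \mathcal{G}(R_y, Q_{u|y})$ then we can calculate $\gamma(Q_{u|y})$ with it. If $Q_{\lambda}(x|u,y) \in \mathcal{G}^c(R_y, Q_{u|y})$: since $Q_{\lambda=0}(x|u,y) \in \mathcal{G}(R_y, Q_{u|y})$, we know that $\mathcal{G}(R_y, Q_{u|y})$ is not empty, and there is a $\delta(Q_{u|y}) \in (0, \lambda)$ for which $Q_{\delta(Q_{u|y})}(x|u,y)$ is on the boundary of $\mathcal{G}(R_y, Q_{u|y})$. As before, our constrained optimizer is on the boundary. So we have for $\gamma(Q_{u|y})$:

$$
\gamma(Q_{u|y}) =
\begin{cases}
\rho\big(R_y + E_{Q_\lambda}\log P(X|U) + H_{Q_\lambda}(X|Y,U) + \lambda E_{Q_\lambda}\log P_1(Y|X)\big) & Q_{\lambda}(x|u,y) \in \mathcal{G}(R_y, Q_{u|y}) \\
\rho\lambda\,E_{Q_{\delta(Q_{u|y})}}\log P_1(Y|X) & Q_{\lambda}(x|u,y) \in \mathcal{G}^c(R_y, Q_{u|y})
\end{cases}
\tag{82}
$$

By the same arguments:

$$
\zeta(Q_{u|y}) =
\begin{cases}
\rho\lambda\,E_{Q_{\delta(Q_{u|y})}}\log P_1(Y|X) & Q_{\rho\lambda}(x|u,y) \in \mathcal{G}(R_y, Q_{u|y}) \\
R_y + E_{Q_{\rho\lambda}}\log P(X|U) + H_{Q_{\rho\lambda}}(X|Y,U) + \rho\lambda E_{Q_{\rho\lambda}}\log P_1(Y|X) & Q_{\rho\lambda}(x|u,y) \in \mathcal{G}^c(R_y, Q_{u|y})
\end{cases}
\tag{83}
$$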
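Letting $E_{\gamma\zeta}(Q_{u|y})$ be the dominant term between $\gamma(Q_{u|y})$ and $\zeta(Q_{u|y})$, the second expectation of $P_{E_{y1}}$ is:

$$
\doteq \exp\Big\{n\max_{Q_{u|y}}\big(E_{\gamma\zeta}(Q_{u|y}) + \hat{E}_u\log P(U) + \hat{H}(U|Y)\big)\Big\} \tag{84}
$$

The last exponent is $E_4(Q_y, R_y, R_{yz}, \rho, \lambda)$ of Theorem 2 as $n \to \infty$.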



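We now proceed to the evaluation of:

$$
E P_{E_{y2}} = \sum_{y} E\Big[P_1^{1-\lambda\rho}(y|X_{m,i})\Big]\times E\Bigg[\bigg(\sum_{m'\neq m}\sum_{i'=1}^{M_y} P_1^{\lambda}(y|X_{m',i'})\bigg)^{\rho}\Bigg] \tag{85}
$$

The first expectation is the same as before. For the second expectation, following the same steps as in (42) we have

$$
E\Bigg[\bigg(\sum_{m'\neq m}\sum_{i'=1}^{M_y} P_1^{\lambda}(y|X_{m',i'})\bigg)^{\rho}\Bigg] \doteq \sum_{Q_{x|y}} e^{n\lambda\rho\hat{E}_{YX}\log P_1(Y|X)}\, E\Bigg[\bigg(\sum_{m'\neq m} N_{y,m'}(Q_{x|y})\bigg)^{\rho}\Bigg] \tag{86}
$$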



and by the arguments that led to (43) we have:

$$
E\Bigg[\bigg(\sum_{m'\neq m} N_{y,m'}(Q_{x|y})\bigg)^{\rho}\Bigg] \doteq \sum_{A\ge 0} e^{n\rho A}\, E\Bigg[\bigg(\sum_{m'\neq m} I_{m'}(A)\bigg)^{\rho}\Bigg] \tag{87}
$$

where, here, $I_{m'}(A) = \mathbb{1}\{N_{y,m'}(Q_{x|y}) = e^{nA}\}$ (as before, we omit the dependence on $Q_{x|y}$ to simplify notation). The only difference between (87) and (43) is that here only $\rho$ multiplies $A$ in the exponent, whereas in (43) we had $\rho\lambda$ multiplying $A$. This fact will change the final result; however, the evaluation of $E\big[\big(\sum_{m'} I_{m'}(A)\big)^{\rho}\big]$ is identical to the weak decoder case, replacing the role of $z$ with $y$ and $P_3(Z|X)$ with $P_1(Y|X)$. We therefore have:

$$
E\Bigg[\bigg(\sum_{m'\neq m} N_{y,m'}(Q_{x|y})\bigg)^{\rho}\Bigg] \doteq e^{nE(Q_{x|y})} \tag{88}
$$



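and for the second expectation we have:

$$
E\Bigg[\bigg(\sum_{m'\neq m}\sum_{i'=1}^{M_y} P_1^{\lambda}(y|X_{m',i'})\bigg)^{\rho}\Bigg] \doteq \exp\Big\{n\max_{Q_{x|y}}\big[\lambda\rho\hat{E}_{yx}\log P_1(y|x) + E(Q_{x|y})\big]\Big\} \tag{89}
$$

The last exponent is $E_5(Q_y, R_y, R_{yz}, \rho, \lambda)$ of Theorem 2 as $n \to \infty$. Taking the maximum of [...] and [...] and using [...], we arrive at $E_y(R_{yz}, R_y)$ after optimizing over the free parameters.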

5.4 Numerical Results 

In this subsection, we revisit the same setup as in Section 4.3. We show some numerical results of the error exponents obtained by the type class enumerators approach and compare them to the exponents of our Gallager-type approach and to Gallager's results [4]. Unlike the calculation of the numerical results of Section 4, which, after setting $\alpha = \mu$, had a straightforward implementation and reasonable computation time, here the calculation is much more complex. For every $\rho, \lambda$ searched, we need to optimize over $Q(u|z), Q(x|z)$ in the intermediate steps 71,72 and finally over $Q(z)$. In Fig. 5, we show the best attainable $E_z(R_y, R_{yz})$ (maximized over $\beta$) for two values of $R_y$, compared to the results in [4] and of Section 4. In both cases, although we confined $\rho$ to $[0, 1]$ in order to limit the computation time, the new exponents are better. We used the $E_y$ that was derived in Section 4 and allowed it to be arbitrarily small (yet positive), thus complying with the definition of an attainable exponent for the weak user.

In both plots of Fig. 5, the exponent becomes zero when the pair $(R_y, R_{yz})$ is outside the capacity region. The improvement gained by the type class enumerators approach is more substantial when $R_y$ is small. As discussed in [18, Appendix E], when the number of elements in the sum of likelihoods (28) is large enough, Jensen's inequality becomes tighter and the results of the Gallager-type approach will be closer to the tight approach results.

A Appendix 

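A.1 Proof of $\lambda = \frac{1}{1+\rho}$ when $\alpha = \mu$

It will be shown below that

$$
\forall\lambda:\quad E_0\Big(\rho, \tfrac{1}{1+\rho}, \alpha, \alpha\Big) \ge E_0(\rho, \lambda, \alpha, \alpha)
$$

where $E_0(\rho, \lambda, \alpha, \alpha)$ was defined in (1). We use the following variant of Hölder's inequality [15, p. 523]: let $a_i, b_i, P_i$ be non-negative numbers defined over a finite set of $i$ with $\sum_i P_i = 1$ and $0 < \gamma < 1$. Then

$$
\sum_i P_i a_i b_i \le \Big(\sum_i P_i a_i^{1/\gamma}\Big)^{\gamma}\Big(\sum_i P_i b_i^{1/(1-\gamma)}\Big)^{1-\gamma} \tag{90}
$$

We have for the weak decoder:

$$
E(R_1, R_2) = \max_{0\le\rho\le 1}\ \max_{0\le\lambda\le\mu\le 1}\ \max_{1-\rho\lambda\le\alpha\le 1}\big\{E_0(\rho, \lambda, \alpha, \mu) - (\alpha + \rho\mu - 1)R_1 - \rho R_2\big\}
$$

where

$$
E_0(\rho, \lambda, \alpha, \mu) = -\log\sum_z\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{(1-\rho\lambda)/\alpha}\Big)^{\alpha}\Bigg]\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\lambda/\mu}\Big)^{\mu}\Bigg]^{\rho}
$$

Substituting $\alpha = \mu$ (with $\max(\lambda, 1-\lambda\rho) \le \alpha \le 1$) we have for $E_0$:

$$
E_0(\rho, \lambda, \alpha, \alpha) = -\log\Bigg\{\sum_z\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{(1-\rho\lambda)/\alpha}\Big)^{\alpha}\Bigg]\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\lambda/\alpha}\Big)^{\alpha}\Bigg]^{\rho}\Bigg\} \tag{91}
$$

Finally,

$$
E_0\Big(\rho, \tfrac{1}{1+\rho}, \alpha, \alpha\Big) = -\log\Bigg\{\sum_z\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{1}{\alpha(1+\rho)}}\Big)^{\alpha}\Bigg]^{1+\rho}\Bigg\}
$$

The proof holds for $1 \ge \rho > 0$. Since when $\rho = 0$ (note that in this case $\alpha = 1$) we have for all $\lambda$: $E_0(\rho = 0, \lambda, 1, 1) = 0$, this is sufficient for our case.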



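Proof. Let us observe the inner term of $E_0(\rho, \frac{1}{1+\rho}, \alpha, \alpha)$:

$$
\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{1}{\alpha(1+\rho)}}\Big)^{\alpha}\Bigg]^{1+\rho} \tag{92}
$$

It is sufficient to show that, for every $z$, this term lower bounds the same term with $\lambda$ instead of $\frac{1}{1+\rho}$ (as in (91)).

To start, we use (90) with the following assignments: $P_i = Q_2(x|u)$, $a_i = P_3(z|x)^{\frac{1-\lambda\rho}{\alpha(1+\rho)}}$, $b_i = P_3(z|x)^{\frac{\lambda\rho}{\alpha(1+\rho)}}$. Applying this we have for $0 < \delta < 1$:

$$
\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{1}{\alpha(1+\rho)}}\Big)^{\alpha}\Bigg]^{1+\rho}
\le \Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{1-\lambda\rho}{\delta\alpha(1+\rho)}}\Big)^{\alpha\delta}\Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{\lambda\rho}{(1-\delta)\alpha(1+\rho)}}\Big)^{\alpha(1-\delta)}\Bigg]^{1+\rho} \tag{93}
$$

At this point we use (90) again over the whole term with the following assignments:

$$
P_i = Q_1(u),\qquad
a_i = \Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{1-\lambda\rho}{\delta\alpha(1+\rho)}}\Big)^{\alpha\delta},\qquad
b_i = \Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{\lambda\rho}{(1-\delta)\alpha(1+\rho)}}\Big)^{\alpha(1-\delta)}
$$

Continuing from (93):

$$
\le \Bigg\{\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{1-\lambda\rho}{\delta\alpha(1+\rho)}}\Big)^{\delta\alpha/\gamma}\Bigg]^{\gamma}
\times\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\frac{\lambda\rho}{(1-\delta)\alpha(1+\rho)}}\Big)^{\frac{(1-\delta)\alpha}{1-\gamma}}\Bigg]^{1-\gamma}\Bigg\}^{1+\rho},\qquad 0 < \gamma < 1.
$$

Assigning $\gamma = \delta = \frac{1}{1+\rho}$ we have:

$$
\le \Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{(1-\rho\lambda)/\alpha}\Big)^{\alpha}\Bigg]\Bigg[\sum_u Q_1(u)\Big(\sum_x Q_2(x|u)P_3(z|x)^{\lambda/\alpha}\Big)^{\alpha}\Bigg]^{\rho}
$$

Note that the last term is equivalent to (92) when $\lambda = \frac{1}{1+\rho}$ and greater or equal for every other value of $\lambda$. Since this is true for every $z$, the proof is completed.

□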
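The inequality proved above can be spot-checked numerically for a single output symbol $z$. The following sketch is an illustration, not part of the proof: it draws random pmfs $Q_1$, $Q_2$ and a random positive column of channel values, then compares the term with $\lambda = 1/(1+\rho)$ against the term with an arbitrary $\lambda$. The helper names (`random_pmf`, `inner_term`, `optimal_lambda_term`, `general_lambda_term`) are assumptions introduced for this example.

```python
import random

def random_pmf(rng, size):
    """Draw a strictly positive pmf of the given size."""
    weights = [rng.random() + 0.01 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def inner_term(q1, q2, p3_col, exponent, alpha):
    """sum_u Q1(u) * (sum_x Q2(x|u) * P3(z|x)**exponent)**alpha for a fixed z."""
    return sum(q1[u] * sum(q2[u][x] * p3_col[x] ** exponent
                           for x in range(len(p3_col))) ** alpha
               for u in range(len(q1)))

def optimal_lambda_term(q1, q2, p3_col, rho, alpha):
    """The term with lambda = 1/(1+rho)."""
    return inner_term(q1, q2, p3_col, 1.0 / (alpha * (1 + rho)), alpha) ** (1 + rho)

def general_lambda_term(q1, q2, p3_col, rho, lam, alpha):
    """The per-z product for a general lambda when alpha = mu."""
    first = inner_term(q1, q2, p3_col, (1 - rho * lam) / alpha, alpha)
    second = inner_term(q1, q2, p3_col, lam / alpha, alpha)
    return first * second ** rho
```

In random trials the first quantity never exceeds the second, and the two coincide at $\lambda = 1/(1+\rho)$, in agreement with the statement of the appendix.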



A.2 The Existence of $\delta(Q_{u|z})$

We need to show that for $Q_{u|z}$ there exists a $\delta(Q_{u|z})$ such that, when $P(x|u,z) \in \mathcal{G}^c(R_y, Q_{u|z})$, the partition function of $\mathcal{G}^c(R_y, Q_{u|z})$ is zero. Namely:

$$
R_y + E_Q\log P(X|U) + H_Q(X|Z,U) = 0 \tag{94}
$$

where the above entropy and expectation are calculated with respect to

$$
Q(x, u, z) = Q^*(x|u,z)\,Q_{u|z}(u|z)\,Q_z(z)
$$

($Q^*(x|u,z)$ is defined in (34)).

Denote $C(\delta(Q_{u|z}), u, z) = \sum_x P(x|u)\,P_3^{\delta(Q_{u|z})}(z|x)$ and define

$$
\begin{aligned}
g(\delta(Q_{u|z})) &= R_y + E_Q\log P(X|U) + H_Q(X|Z,U) \\
&= R_y + E_Q\log\frac{P(X|U)\,C(\delta(Q_{u|z}), u, z)}{P(X|U)\,P_3^{\delta(Q_{u|z})}(Z|X)} \\
&= R_y + \delta(Q_{u|z})\,E_Q\log\frac{1}{P_3(Z|X)} + \hat{E}_{uz}\log C(\delta(Q_{u|z}), u, z)
\end{aligned}
\tag{95}
$$

For $P(x|u,z) \in \mathcal{G}^c(R_y, Q_{u|z})$, $g(1) < 0$, and since $R_y > 0$, $g(0) > 0$. Therefore, because of the continuity of $g(\delta(Q_{u|z}))$, we conclude that there exists $\delta(Q_{u|z}) \in [0, 1)$ such that $g(\delta(Q_{u|z})) = 0$. It can be shown that $g(\delta(Q_{u|z}))$ is non-increasing for $\delta > 0$.
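Because $g$ is continuous with $g(0) > 0 > g(1)$, the root $\delta(Q_{u|z})$ can be located by bisection. The sketch below is a minimal illustration under assumed inputs: `p_x_given_u[u][x]` stands for $P(x|u)$, `p_z_given_x[x][z]` for $P_3(z|x)$, and `q_uz` is a dictionary holding a joint empirical pmf of the pair $(u, z)$. The function names and the toy channel used in the usage note are not from the paper.

```python
import math

def g(delta, rate, p_x_given_u, p_z_given_x, q_uz):
    """g(delta) = R_y + sum_{u,z} Q(u,z) [delta * E_{Q*} log(1/P3(z|X)) + log C(delta,u,z)]."""
    total = rate
    for (u, z), weight in q_uz.items():
        tilted = [p_x_given_u[u][x] * p_z_given_x[x][z] ** delta
                  for x in range(len(p_z_given_x))]
        c = sum(tilted)  # normalizer C(delta, u, z)
        mean_log_inv = sum(t / c * -math.log(p_z_given_x[x][z])
                           for x, t in enumerate(tilted))
        total += weight * (delta * mean_log_inv + math.log(c))
    return total

def find_delta(rate, p_x_given_u, p_z_given_x, q_uz, tol=1e-12):
    """Bisection for the root of g on (0, 1); requires g(0) > 0 > g(1)."""
    lo, hi = 0.0, 1.0
    if g(lo, rate, p_x_given_u, p_z_given_x, q_uz) <= 0:
        raise ValueError("g(0) must be positive")
    if g(hi, rate, p_x_given_u, p_z_given_x, q_uz) >= 0:
        raise ValueError("g(1) must be negative")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid, rate, p_x_given_u, p_z_given_x, q_uz) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, with a single cloud symbol, a uniform binary input, a binary symmetric channel with crossover 0.1 and $R_y = 0.2$ nats, `g(0)` equals the rate, `g(1)` equals the rate minus the mutual information, and the bisection returns a value strictly between 0 and 1.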

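A.3 The Behavior of $M_{Q_{u|z}}$

The probability that a cloud center $u_m$, drawn with $P(u^n) = \prod_{i=1}^{n} P(u_i)$, will belong to $T_{u|z}$ is (exponentially) $e^{n(\hat{E}_u\log P(U) + \hat{H}(U|Z))}$. Using $D(a\|b) \ge a\big(\ln\frac{a}{b} - 1\big)$ ([10, Appendix]) and the Chernoff bound we have:

$$
\Pr\big(M_{Q_{u|z}} \ge e^{na}\big) \le \exp\Big\{-n e^{na}\big[a - R_{yz} - \hat{H}(u|z) - \hat{E}_u\log P(U)\big]\Big\},\qquad a > R_{yz} + \hat{H}(U|Z) + \hat{E}_u\log P(U) \tag{96}
$$

$$
\Pr\big(M_{Q_{u|z}} \le e^{na}\big) \le \exp\Big\{n e^{na}\big[a - R_{yz} - \hat{H}(u|z) - \hat{E}_u\log P(U)\big]\Big\},\qquad a < R_{yz} + \hat{H}(U|Z) + \hat{E}_u\log P(U) \tag{97}
$$

Therefore, for $Q_{u|z} \in \mathcal{G}_z(R_{yz})$ and $\epsilon > 0$:

$$
\begin{aligned}
\Pr\Big(M_{Q_{u|z}} \doteq e^{n(R_{yz} + \hat{H}(u|z) + \hat{E}_u\log P(u))}\Big)
&= 1 - \Pr\Big(M_{Q_{u|z}} \ge e^{n(R_{yz} + \hat{H}(u|z) + \hat{E}_u\log P(u) + \epsilon)}\Big) \\
&\quad - \Pr\Big(M_{Q_{u|z}} \le e^{n(R_{yz} + \hat{H}(u|z) + \hat{E}_u\log P(u) - \epsilon)}\Big) \\
&\ge 1 - 2\exp\Big\{-n\epsilon\,e^{n(R_{yz} + \hat{H}(u|z) + \hat{E}_u\log P(u) - \epsilon)}\Big\}
\end{aligned}
\tag{98}
$$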
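And thus, for $Q_{u|z} \in \mathcal{G}_z(R_{yz})$, $M_{Q_{u|z}}$ converges to its expectation double exponentially fast. It is obvious from (97) that when $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$, we won't find an exponential number of cloud centers of this type. Furthermore, the dominant term in $E M_{Q_{u|z}}$ will be $1\cdot\Pr(M_{Q_{u|z}} = 1)$. We now show the exponential behavior of $M_{Q_{u|z}}$ when $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$:

$$
\begin{aligned}
\Pr\big(M_{Q_{u|z}} = 1\big) &= e^{nR_{yz}}\,e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}\Big(1 - e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}\Big)^{e^{nR_{yz}} - 1} \\
&\le e^{nR_{yz}}\,e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))} \\
&= e^{n\,m(Q_{u|z})}
\end{aligned}
\tag{99}
$$

$$
\begin{aligned}
\Pr\big(M_{Q_{u|z}} = 1\big) &= e^{nR_{yz}}\,e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}\Big(1 - e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}\Big)^{e^{nR_{yz}} - 1} \\
&\ge e^{n\,m(Q_{u|z})}\Big(1 - e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}\Big)^{e^{nR_{yz}}} \\
&= e^{n\,m(Q_{u|z})}\exp\Big[e^{nR_{yz}}\log\Big(1 - e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}\Big)\Big] \\
&\ge e^{n\,m(Q_{u|z})}\exp\Bigg[e^{nR_{yz}}\,\frac{-e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}}{1 - e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}}\Bigg] && \text{(100)} \\
&= e^{n\,m(Q_{u|z})}\exp\Bigg[\frac{-e^{n\,m(Q_{u|z})}}{1 - e^{n(\hat{H}(U|Z) + \hat{E}_u\log P(U))}}\Bigg] \doteq e^{n\,m(Q_{u|z})} && \text{(101)}
\end{aligned}
$$

where in (100) we used $\log(1+x) \ge \frac{x}{1+x}$, and the last line is true since $e^{n\,m(Q_{u|z})} \to 0$ when $n \to \infty$ for $Q_{u|z} \in \mathcal{G}_z^c(R_{yz})$. To conclude, we have:

$$
\Pr\big(M_{Q_{u|z}} = 1\big) \doteq e^{n\,m(Q_{u|z})} \tag{102}
$$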
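The sandwich (99)-(101) can be verified numerically in the log domain. The sketch below is an illustration with assumed names: `rate` plays the role of $R_{yz}$, `type_exponent` stands for $\hat{H}(U|Z) + \hat{E}_u\log P(U)$ (a negative number), and their sum is $m(Q_{u|z})$, taken negative here so that the type is rare.

```python
import math

def log_prob_exactly_one(n, rate, type_exponent):
    """log Pr{M = 1} for M ~ Binomial(e^{n*rate}, e^{n*type_exponent})."""
    num = math.exp(n * rate)
    p = math.exp(n * type_exponent)
    return n * (rate + type_exponent) + (num - 1) * math.log1p(-p)

def log_bounds(n, rate, type_exponent):
    """Lower and upper bounds on log Pr{M = 1}, following the lines of (99)-(101)."""
    p = math.exp(n * type_exponent)
    m = rate + type_exponent
    upper = n * m
    lower = n * m - math.exp(n * m) / (1 - p)
    return lower, upper
```

When $m(Q_{u|z}) < 0$ the correction term in the lower bound is itself exponentially small, so both bounds share the exponent $m(Q_{u|z})$.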
A.4 Deriving $P_A(Q_{x|z}, Q_{u|z})$

For a given $u^*$, the probability of drawing $x$ with $P(x|u)$ which will belong to $T_{x|z}$ is

$$
\begin{aligned}
\sum_{x\in T_{x|z}} P(x|u^*) &= \sum_{x\in T_{x|z}}\prod_{i=1}^{n} P(x_i|u_i^*) \\
&= \sum_{T_{x|z,u^*}\subset T_{x|z}} |T_{x|z,u^*}| \prod_{a,b,c} P(b|a)^{n\hat{P}(a,b,c)}
\end{aligned}
\tag{103}
$$

where $\hat{P}(a,b,c)$ is the joint empirical distribution of the triplet $a \in \mathcal{U}$, $b \in \mathcal{X}$, $c \in \mathcal{Z}$. Note that for different $x \in T_{x|z}$, $\hat{P}(a,b,c)$ takes different values. Exponentially, the behavior will be according to the maximal element. Namely:

$$
\doteq \exp\Big\{n\max_{T_{x|z,u^*}\subset T_{x|z}}\big[\hat{E}\log P(x|u) + \hat{H}(x|z,u)\big]\Big\} \tag{104}
$$

The last expression remains true for all permutations of $u^*$ which belong to $T_{u^*|z}$. This is because we can apply the same permutation to the $x$ vector and get the same value in the exponent. This value will be the maximizer since the range of the maximization remains constant while $u$ belongs to the same $T_{u^*|z}$. For a given $u \in T_{u|z}$ (if there is such a $u$ in our random codebook) we draw $e^{nR_y}$ sequences $x$ independently according to $\prod_{i=1}^{n} P(x_i|u_i)$. Therefore, the average number of $x$ that will belong to $T_{x|z}$ when $u$ belongs to $T_{u|z}$ is

$$
\exp\Big\{n\Big(R_y + \max_{T_{x|z,u}\subset T_{x|z}}\big[\hat{E}\log P(x|u) + \hat{H}(x|z,u)\big]\Big)\Big\} \triangleq e^{nN(Q_{x|z}, Q_{u|z}, R_y)} \tag{105}
$$

Since we are evaluating the probability of drawing an exponential number of $x$ which will belong to $T_{x|z}$, we are only interested in the case where the last exponent is positive. By the same arguments as in Section A.3, when $N(Q_{x|z}, Q_{u|z}, R_y) > 0$ the number of $\{x_m\}$ which will belong to $T_{x|z}$ concentrates double exponentially fast around the expectation (105). Therefore, for $N(Q_{x|z}, Q_{u|z}, R_y) > 0$ and $\epsilon > 0$:

$$
\Pr\Big\{\mathbb{1}\big[N_{z,m'}(Q_{x|z}) \doteq e^{nN(Q_{x|z}, Q_{u|z}, R_y)}\big] = 1\Big\} \ge 1 - 2\exp\Big\{-n\epsilon\,e^{n(N(Q_{x|z}, Q_{u|z}, R_y) - \epsilon)}\Big\} \tag{106}
$$

To conclude, $P_A(Q_{x|z}, Q_{u|z})$ either vanishes double exponentially fast if $A \neq N(Q_{x|z}, Q_{u|z}, R_y)$ or converges double exponentially fast to 1 if $A = N(Q_{x|z}, Q_{u|z}, R_y)$.

When the exponent in (105) is negative, for every $A > 0$, $P_A(Q_{x|z}, Q_{u|z})$ vanishes double exponentially fast. However, for $A = 0$, by the same arguments as in Section A.3 we show that

$$
\Pr\big\{N_{z,m'}(Q_{x|z}) = e^{n0}\big\} = \Pr\big\{1 \le N_{z,m'}(Q_{x|z}) \le e^{n\epsilon}\big\} \doteq \Pr\big\{N_{z,m'}(Q_{x|z}) = 1\big\} \tag{107}
$$

and

$$
\Pr\big\{N_{z,m'}(Q_{x|z}) = 1\big\} \doteq e^{nN(Q_{x|z}, Q_{u|z}, R_y)} \tag{108}
$$



References 

[1] T. M. Cover, "Broadcast channels," IEEE Transactions on Information Theory, vol. 18, no. 1, 
pp. 2-14, January 1972. 

[2] J. Korner and K. Marton, "General broadcast channels with degraded message sets," IEEE 
Transactions on Information Theory, vol. 23, no. 1, pp. 60-64, January 1977. 

[3] P. P. Bergmans, "Random coding theorem for broadcast channels with degraded components," 
IEEE Transactions on Information Theory, vol. 19, no. 2, pp. 197-207, March 1973. 

[4] R. G. Gallager, "Capacity and coding for degraded broadcast channels," Problemy Peredachi 
Informatsii, vol. 10, no. 3, pp. 3-14, 1974. 






[5] L. Weng, S. S. Pradhan, and A. Anastasopoulos, "Error exponent regions for Gaussian broadcast and multiple-access channels," IEEE Transactions on Information Theory, vol. 54, no. 7, pp. 2919-2942, July 2008.

[6] J. Korner and A. Sgarro, "Universally attainable error exponents for broadcast channels with 
degraded message sets," IEEE Transactions on Information Theory, vol. 26, no. 6, pp. 670-679, 
November 1980. 

[7] G. D. Forney, "Exponential error bounds for erasure, list, and decision feedback schemes," 
IEEE Transactions on Information Theory, vol. 14, no. 2, pp. 206-220, March 1968. 

[8] M. Mezard and A. Montanari, Constraint Satisfaction Networks in Physics and Computation. 
Oxford University Press, 2009. 

[9] N. Merhav, "Relations between random coding exponents and the statistical physics of random 
codes," IEEE Transactions on Information Theory, vol. 55, no. 1, pp. 83-92, January 2009. 

[10] N. Merhav, "Error exponents of erasure/list decoding revisited via moments of distance enumerators," IEEE Transactions on Information Theory, vol. 54, no. 10, pp. 4439-4447, October 2008.

[11] R. Etkin, N. Merhav, and E. Ordentlich, "Error exponents of optimum decoding for the interference channel," in Proceedings of the International Symposium on Information Theory, 2008, pp. 1523-1527.

[12] K. Marton, "A coding theorem for the discrete memoryless broadcast channel," IEEE Transactions on Information Theory, vol. 25, no. 3, pp. 306-311, May 1979.






[13] A. El Gamal and E. C. van der Meulen, "A proof of Marton's coding theorem for the discrete memoryless broadcast channel," IEEE Transactions on Information Theory, vol. 27, no. 1, pp. 120-122, January 1981.

[14] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Coding. McGraw- 
Hill, 1979. 

[15] R. G. Gallager, Information Theory and Reliable Communication. Wiley, 1968. 

[16] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. Wiley, 2006. 

[17] I. Csiszar and J. Korner, Information Theory: Coding Theorems for Discrete Memoryless 
Systems. Academic Press, 1981. 

[18] Y. Kaspi, "Error exponents for broadcast channels with degraded message sets," Master's thesis, Technion - Israel Institute of Technology, Haifa, Israel, April 2009.






